189 commits

Author SHA1 Message Date
Terts Diepraam
105b0db251 rmdir, split: add fs feature to uucore dep 2022-08-20 15:13:18 +02:00
Terts Diepraam
15180249fc Version 0.0.15 2022-08-20 13:13:22 +02:00
Owen Anderson
9fad6fde35
Fix a bug in split where chunking would be skipped when the chunk size (#3800)
* Fix a bug in split where chunking would be skipped when the chunk size
happened to be an exact divisor of the buffer size used to read the
input stream.

The issue here was that the file was being split byte-wise in chunks of 1G.
The input stream was being read in chunks of 8KB, which evenly divides
the chunk size. Because the check to allocate the next output chunk was
done at the bottom of the loop previously, it would never occur because
the current input chunk was fully consumed at that point. By moving the
check to the top of the loop (but still late enough that we know we have
bytes to write) we resolve this issue.

This scenario is unfortunately hard to write a test for, since we don't
explicitly control the input chunk size.

Fixes https://github.com/uutils/coreutils/issues/3790
2022-08-16 11:02:52 +02:00
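A minimal sketch of the loop shape described in 9fad6fde35, assuming an in-memory stand-in for the output files. `split_bytes` and its parameters are hypothetical names, not the uutils implementation. The only point illustrated is that the "open the next chunk" check runs at the top of the inner loop, once bytes are known to be pending, so it still fires when the read buffer size divides the chunk size exactly.

```rust
use std::io::{self, Read};

/// Split `input` into in-memory chunks of at most `chunk_size` bytes,
/// reading `buf_size` bytes at a time. Hypothetical helper for illustration.
fn split_bytes<R: Read>(
    mut input: R,
    chunk_size: usize,
    buf_size: usize,
) -> io::Result<Vec<Vec<u8>>> {
    assert!(chunk_size > 0 && buf_size > 0);
    let mut chunks: Vec<Vec<u8>> = Vec::new();
    let mut remaining_in_chunk = 0; // no chunk is open yet
    let mut buf = vec![0u8; buf_size];
    loop {
        let n = input.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        let mut pending = &buf[..n];
        while !pending.is_empty() {
            // Check at the top of the loop: a new chunk is opened only
            // when we already know there are bytes to write into it.
            if remaining_in_chunk == 0 {
                chunks.push(Vec::new());
                remaining_in_chunk = chunk_size;
            }
            let take = pending.len().min(remaining_in_chunk);
            chunks
                .last_mut()
                .expect("a chunk was opened above")
                .extend_from_slice(&pending[..take]);
            remaining_in_chunk -= take;
            pending = &pending[take..];
        }
    }
    Ok(chunks)
}
```

With a 32-byte input, a chunk size of 16 and a buffer size of 8 (which divides 16 evenly), this produces two 16-byte chunks and no trailing empty one.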
Daniel Hofstetter
7c3116330e Replace deprecated is_present() with contains_id() 2022-08-02 15:21:39 +02:00
Daniel Hofstetter
fc4544c42b bump clap from 3.1.18 to 3.2.15 2022-07-29 14:05:02 +02:00
Andrew Baptist
f2cfc15a70 split: Don't overwrite files
Check whether a file already exists by calling create_new, and change
the interface of instantiate_current_writer to return a Result rather
than calling unwrap.
2022-07-21 12:06:13 -04:00
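For readers unfamiliar with `create_new`, here is a small sketch of the idea behind f2cfc15a70 using the standard library's `OpenOptions`. The helper name `open_new_chunk` is an assumption made for illustration; the actual change lives in `instantiate_current_writer`, whose body is not shown in this log.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::Path;

/// Open `path` for writing, but refuse to touch an existing file.
/// Returns an error of kind `AlreadyExists` if the file is already there,
/// so the caller can report it instead of panicking via `unwrap`.
fn open_new_chunk(path: &Path) -> io::Result<File> {
    OpenOptions::new()
        .write(true)
        .create_new(true) // fail rather than overwrite
        .open(path)
}
```

Because the existence check and the creation happen in one call, there is no window between "check" and "create" in which another process could create the file.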
Daniel Hofstetter
2261051239 split: set names for arg values 2022-05-30 09:10:18 +02:00
Terts Diepraam
eae07adfb1
Version 0.0.14 (#3553)
Version 0.0.14
2022-05-22 19:57:19 +02:00
Terts Diepraam
0acfa07d77 all: add value hints 2022-05-13 16:15:50 +02:00
Terts Diepraam
c6c936f529 all: remove explicit imports of TryFrom and TryInto
This is enabled by changing the edition from 2018 to 2021
2022-04-05 10:39:31 +02:00
Terts Diepraam
af9f718936 Change edition to 2021 2022-04-05 10:39:31 +02:00
Terts Diepraam
b7809bd889 version 0.0.13 2022-04-02 11:04:27 +02:00
Jeffrey Finkelstein
e357d2650c clippy fixes from nightly rust 2022-03-22 21:44:33 -04:00
Sylvestre Ledru
04b219bdef
Merge pull request #3229 from uutils/dependabot/cargo/clap-3.1.6
build(deps): bump clap from 3.0.10 to 3.1.6
2022-03-20 17:44:33 +01:00
Jeffrey Finkelstein
b79ff6b4fd split: avoid writing final empty chunk with -C
Fix a bug in which a final empty file was written when using `split
--line-bytes` mode.
2022-03-20 09:30:58 -04:00
Jeffrey Finkelstein
95f58fbf3c split: handle no final newline with --line-bytes
Fix a panic due to out-of-bounds indexing when using `split
--line-bytes` with an input that had no trailing newline.
2022-03-19 23:50:02 -04:00
Sylvestre Ledru
1d17400b80
Merge pull request #3274 from jfinkels/split-elide-empty-files
split: elide all chunks when input file is empty
2022-03-19 22:52:57 +01:00
Jeffrey Finkelstein
0a226524a6 split: elide all chunks when input file is empty
Fix a bug in the behavior of `split -e -n NUM` when the input file is
empty. Previously, it would panic due to overflow when subtracting 1
from 0. After this change, it will terminate successfully and produce
no output chunks.
2022-03-19 14:32:28 -04:00
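The overflow mentioned in 0a226524a6 is the usual unsigned `0 - 1` problem: in a debug build Rust panics on it, while a default release build wraps around. A hedged sketch of one way to guard against it follows; `last_chunk_index` is a hypothetical name and this is not necessarily how the commit fixes it.

```rust
/// Index of the final chunk, or `None` when there are no chunks at all
/// (for example, an empty input file with `-e`).
fn last_chunk_index(num_chunks: u64) -> Option<u64> {
    // `checked_sub` returns `None` instead of overflowing on 0 - 1.
    num_chunks.checked_sub(1)
}
```

A caller can then treat `None` as "nothing to write" and terminate successfully.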
Jeffrey Finkelstein
6d2eff9c27 split: catch and handle broken pipe errors
Catch `BrokenPipe` errors and silently ignore them so that `split`
terminates successfully on a broken pipe. This matches the behavior of
GNU `split`.
2022-03-19 12:11:03 -04:00
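A sketch of the pattern described in 6d2eff9c27, assuming a generic writer. The helper name and its `bool` return value are illustrative choices, not the uutils code.

```rust
use std::io::{self, ErrorKind, Write};

/// Write `data` to `out`, treating a broken pipe as a normal stop.
/// Returns `Ok(true)` if the data was written, `Ok(false)` if the reader
/// on the other end went away, and `Err` for any other I/O error.
fn write_ignoring_broken_pipe<W: Write>(out: &mut W, data: &[u8]) -> io::Result<bool> {
    match out.write_all(data) {
        Ok(()) => Ok(true),
        // Silently ignore a closed pipe so the program can exit successfully.
        Err(e) if e.kind() == ErrorKind::BrokenPipe => Ok(false),
        Err(e) => Err(e),
    }
}
```

The caller can stop producing output on `Ok(false)` and still exit with status 0.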
Sylvestre Ledru
3c88fb460b
Merge branch 'main' into dependabot/cargo/clap-3.1.6 2022-03-19 09:26:05 +01:00
Sylvestre Ledru
9796e01df6
Revert "split: implement round-robin arg to --number" 2022-03-18 14:45:29 +01:00
Terts Diepraam
20212be4c8 fix clippy errors related to clap upgrade from 3.0.10 to 3.1.6 2022-03-17 22:46:56 +01:00
dependabot[bot]
59440d35c0
build(deps): bump clap from 3.0.10 to 3.1.6
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.10 to 3.1.6.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.10...v3.1.6)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-17 13:06:29 +00:00
Jeffrey Finkelstein
18bfd1ac68 split: implement round-robin arg to --number
Implement distributing lines of a file in a round-robin manner to a
specified number of chunks. For example,

    $ (seq 1 10 | split -n r/3) && head -v xa[abc]
    ==> xaa <==
    1
    4
    7
    10

    ==> xab <==
    2
    5
    8

    ==> xac <==
    3
    6
    9
2022-03-15 18:22:44 -04:00
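The round-robin distribution in 18bfd1ac68 (later reverted, see 9796e01df6) can be sketched in a few lines. This version works on an in-memory string rather than files, and `round_robin` is a hypothetical name.

```rust
/// Distribute the lines of `input` over `num_chunks` chunks in turn:
/// line 0 goes to chunk 0, line 1 to chunk 1, and so on, wrapping around.
fn round_robin(input: &str, num_chunks: usize) -> Vec<String> {
    assert!(num_chunks > 0);
    let mut chunks = vec![String::new(); num_chunks];
    // `split_inclusive` keeps the trailing newline on each line.
    for (i, line) in input.split_inclusive('\n').enumerate() {
        chunks[i % num_chunks].push_str(line);
    }
    chunks
}
```

Applied to the output of `seq 1 10` with three chunks, this reproduces the grouping shown in the commit message: 1, 4, 7, 10 in the first chunk, 2, 5, 8 in the second, and 3, 6, 9 in the third.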
Sylvestre Ledru
bfd1e14137
Merge pull request #3204 from jfinkels/split-line-bytes
split: implement --line-bytes option
2022-03-12 09:45:07 +01:00
Jeffrey Finkelstein
77d92883c7 split: implement --line-bytes option
Implement the `--line-bytes` option to `split`. In this mode, the
program tries to write as many lines of the input as possible to each
chunk of output without exceeding a specified byte limit. The new
`LineBytesChunkWriter` struct represents this functionality.
2022-03-10 22:51:49 -05:00
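A rough sketch of the `--line-bytes` idea from 77d92883c7, working on byte slices in memory. It is not the `LineBytesChunkWriter` implementation. In particular, the handling of a line longer than the limit (start a fresh chunk, then cut the line into limit-sized pieces) is an assumption made here for illustration.

```rust
/// Pack whole lines into chunks of at most `limit` bytes each.
fn line_bytes_chunks(input: &[u8], limit: usize) -> Vec<Vec<u8>> {
    assert!(limit > 0);
    let mut chunks: Vec<Vec<u8>> = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    for line in input.split_inclusive(|&b| b == b'\n') {
        let mut rest = line;
        // The line does not fit in what is left of the current chunk:
        // close that chunk before writing the line.
        if !current.is_empty() && current.len() + rest.len() > limit {
            chunks.push(std::mem::take(&mut current));
        }
        // Assumption: an over-long line is cut into `limit`-sized pieces.
        while rest.len() > limit {
            let (head, tail) = rest.split_at(limit);
            chunks.push(head.to_vec());
            rest = tail;
        }
        current.extend_from_slice(rest);
    }
    // Only emit the final chunk if it holds data, so no empty file results.
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

The final `if` also covers input without a trailing newline: the last partial line is simply the tail of the last chunk.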
Jeffrey Finkelstein
b42168e9dc Clippy fixes in multiple crates 2022-03-10 22:31:21 -05:00
Sylvestre Ledru
54a10e955a Update of the cargo.lock url to point to the right branch 2022-03-06 22:13:17 +01:00
Davide Cavalca
19af43222b Include license text in all published crates 2022-03-05 21:21:46 +01:00
Jeffrey Finkelstein
ee36dea1a9 split: implement outputting kth chunk of file
Implement `-n l/k/N` option, where the `k`th chunk of the input file
is written to stdout. For example,

    $ seq -w 0 99 > f; split -n l/3/10 f
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
2022-03-05 10:27:51 +01:00
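The `-n l/k/N` example in ee36dea1a9 can be reproduced with a short in-memory sketch. The rule used here (a line belongs to the chunk in which its first byte falls, with chunk boundaries every `len / N` bytes) is an assumption that matches the example output; the function names are hypothetical.

```rust
/// Split `input` into `n` chunks without breaking lines (sketch of `-n l/N`).
fn line_chunks(input: &str, n: usize) -> Vec<String> {
    assert!(n > 0);
    let chunk_size = (input.len() / n).max(1);
    let mut chunks = vec![String::new(); n];
    let mut offset = 0; // byte offset of the current line's first byte
    for line in input.split_inclusive('\n') {
        // Lines starting past the last boundary all land in the last chunk.
        let index = (offset / chunk_size).min(n - 1);
        chunks[index].push_str(line);
        offset += line.len();
    }
    chunks
}

/// Sketch of `-n l/k/N`: return only the `k`th chunk (1-based).
fn kth_line_chunk(input: &str, k: usize, n: usize) -> Option<String> {
    if k == 0 {
        return None;
    }
    line_chunks(input, n).into_iter().nth(k - 1)
}
```

For the 100 three-byte lines produced by `seq -w 0 99`, each of the 10 chunks covers 30 bytes, so chunk 3 holds the lines 20 through 29.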
Sylvestre Ledru
346cfa060b
Merge pull request #2980 from jfinkels/split-lines-2
split: add support for "-n l/NUM" option to split
2022-03-01 10:13:44 +01:00
Jeffrey Finkelstein
dbbee573ab split: add support for "-n l/NUM" option to split
Add support for `split -n l/NUM`. Previously, `split` only supported
`-n NUM`, which splits a file into `NUM` chunks by byte. The `-n
l/NUM` strategy splits a file into `NUM` chunks without splitting
lines across chunks.
2022-02-22 18:44:08 -05:00
Jeffrey Finkelstein
92d461247e split: extend Strategy::Number to add NumberType
Make the `Strategy::Number` enumeration value more general by
replacing the number parameter with a `NumberType` enum parameter.
This allows a future commit to update `split` to support the various
sub-strategies for the `-n` option. (This commit does not add support for the
other sub-strategies.)
2022-02-22 18:41:29 -05:00
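To make the refactoring in 92d461247e concrete, here is a hedged sketch of an enum-within-enum design. Only the names `Strategy`, `Strategy::Number` and `NumberType` come from the commit message. Every variant name and the parser `parse_number` are assumptions for illustration.

```rust
/// Sub-strategies of `-n` (variant names are hypothetical).
#[derive(Debug, PartialEq)]
enum NumberType {
    Bytes(u64),         // -n N
    Lines(u64),         // -n l/N
    KthLines(u64, u64), // -n l/K/N
    RoundRobin(u64),    // -n r/N
}

/// Chunking strategy; the non-`Number` variants are placeholders.
#[derive(Debug, PartialEq)]
enum Strategy {
    Lines(u64),
    Bytes(u64),
    Number(NumberType),
}

/// Parse the argument of `-n` into a `Strategy::Number`.
fn parse_number(arg: &str) -> Option<Strategy> {
    let parts: Vec<&str> = arg.split('/').collect();
    let number_type = match parts.as_slice() {
        [n] => NumberType::Bytes(n.parse::<u64>().ok()?),
        ["l", n] => NumberType::Lines(n.parse::<u64>().ok()?),
        ["l", k, n] => NumberType::KthLines(k.parse::<u64>().ok()?, n.parse::<u64>().ok()?),
        ["r", n] => NumberType::RoundRobin(n.parse::<u64>().ok()?),
        _ => return None,
    };
    Some(Strategy::Number(number_type))
}
```

Wrapping the sub-strategy in its own enum means new `-n` forms can be added as variants without touching the other `Strategy` cases.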
Omer Tuchfeld
0ce22f3a08 Improve coverage / error messages from parse_size PR
https://github.com/uutils/coreutils/pull/3084 (2a333ab391) had some
missing coverage and was merged before I had a chance to fix it.

This PR adds some coverage / improved error messages that were missing
from that previous PR.
2022-02-22 22:09:45 +01:00
Gilad Naaman
159a1dc1db Fix type-error when calling parse_size from split 2022-02-22 13:49:20 +01:00
Terts Diepraam
53070141c1
all: add format_usage function (#3139)
This should correct the usage strings in both the `--help` and user documentation. Previously, sometimes the name of the utils did not show up correctly.
2022-02-21 17:14:03 +01:00
Terts Diepraam
938c5acbbe
Merge pull request #3146 from ndd7xv/split-suffix-check
split: error when num. of chunks is greater than num. of possible filenames
2022-02-20 17:12:05 +01:00
Jeffrey Finkelstein
6718d97f97 split: add support for -e argument
Add the `-e` flag, which indicates whether to elide (that is, remove)
empty files that would have been created by the `-n` option.

The `-n` command-line argument gives a specific number of chunks into
which the input files will be split. If the number of chunks is
greater than the number of bytes, then empty files will be created for
the excess chunks. But if `-e` is given, then empty files will not be
created.

For example, contrast

    $ printf 'a\n' > f && split -e -n 3 f && cat xaa xab xac
    a
    cat: xac: No such file or directory

with

    $ printf 'a\n' > f && split -n 3 f && cat xaa xab xac
    a
2022-02-17 19:03:51 -05:00
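The effect of `-e` in 6718d97f97 can be sketched as a computation of planned chunk sizes. `chunk_sizes` is a hypothetical helper, and giving the remainder bytes to the last chunk is an assumption. The sketch is only meant to show how capping the chunk count at the byte count removes the empty files.

```rust
/// Planned size in bytes of each output chunk for `split -n num_chunks`.
fn chunk_sizes(num_bytes: u64, num_chunks: u64, elide_empty: bool) -> Vec<u64> {
    // With -e, never plan more chunks than there are bytes to put in them.
    let n = if elide_empty {
        num_chunks.min(num_bytes)
    } else {
        num_chunks
    };
    if n == 0 {
        return Vec::new(); // nothing to write at all
    }
    let mut sizes = vec![num_bytes / n; n as usize];
    // Assumption: leftover bytes are appended to the last chunk.
    sizes[(n - 1) as usize] += num_bytes % n;
    sizes
}
```

For the two-byte file in the commit message and `-n 3`, this plans two one-byte chunks with `-e` (so `xac` is never created) and three chunks, some of them empty, without it.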
Terts Diepraam
e1a611374a
Merge pull request #2981 from jfinkels/split-hex-numbers
split: add support for -x option (hex suffixes)
2022-02-17 23:20:58 +01:00
ndd7xv
494d709e0f split: small tweaks to wording
Rename the `SuffixType` enum variants and make the hex suffix help consistent with the numeric suffix help.
2022-02-17 01:04:26 -05:00
ndd7xv
6c3fc7b214 split: throw error when # chunks > # filenames from suffix length 2022-02-16 23:53:56 -05:00
Jeffrey Finkelstein
4470430c89 split: add support for -x option (hex suffixes)
Add support for the `-x` command-line option to `split`. This option
causes `split` to produce filenames with hexadecimal suffixes instead
of the default alphabetic suffixes.
2022-02-16 23:53:56 -05:00
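A sketch of how a `SuffixType` enum can drive fixed-width suffix generation, including the hexadecimal case added in 4470430c89. The variant names are taken from the commit messages in this log. The methods and the `fixed_width_suffix` function are assumptions for illustration.

```rust
#[derive(Clone, Copy)]
enum SuffixType {
    Alphabetic,
    NumericDecimal,
    NumericHexadecimal,
}

impl SuffixType {
    /// Number of distinct symbols available per suffix position.
    fn radix(self) -> u32 {
        match self {
            SuffixType::Alphabetic => 26,
            SuffixType::NumericDecimal => 10,
            SuffixType::NumericHexadecimal => 16,
        }
    }

    /// Symbol for a single digit value, if it is in range.
    fn digit(self, value: u32) -> Option<char> {
        match self {
            SuffixType::Alphabetic if value < 26 => Some((b'a' + value as u8) as char),
            SuffixType::Alphabetic => None,
            _ => char::from_digit(value, self.radix()),
        }
    }
}

/// Suffix of exactly `width` symbols for chunk number `index`, or `None`
/// when `index` does not fit in that width (suffixes exhausted).
fn fixed_width_suffix(mut index: u64, width: usize, kind: SuffixType) -> Option<String> {
    let radix = u64::from(kind.radix());
    let mut digits = vec!['?'; width];
    // Fill from the rightmost position, like writing a number in base `radix`.
    for slot in digits.iter_mut().rev() {
        *slot = kind.digit((index % radix) as u32)?;
        index /= radix;
    }
    if index != 0 {
        return None;
    }
    Some(digits.into_iter().collect())
}
```

With width 2, chunk 255 gets the suffix `ff` in hexadecimal mode, while chunk 10 has no valid one-character decimal suffix.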
Jeffrey Finkelstein
891c5d1ffa split: add SuffixType::NumericHexadecimal
Add a `NumericHexadecimal` member to the `SuffixType` enum so that a
future commit can add support for hexadecimal filename suffixes to the
`split` program.
2022-02-16 23:53:56 -05:00
Jeffrey Finkelstein
aa4c5aea50 split: refactor to add SuffixType enum
Refactor the code to use a `SuffixType` enumeration with two members,
`Alphabetic` and `NumericDecimal`, representing the two currently
supported ways of producing filename suffixes. This prepares the code
to more easily support other formats, like numeric hexadecimal.
2022-02-16 23:53:56 -05:00
DevSabb
6d6371741a
include io-blksize parameter (#3064)
* include io-blksize parameter

* format changes for including io-blksize

Co-authored-by: DevSabb <devsabb@local>
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
2022-02-14 19:47:18 +01:00
Jeffrey Finkelstein
a4955b4e06 split: add support for -x option (hex suffixes)
Add support for the `-x` command-line option to `split`. This option
causes `split` to produce filenames with hexadecimal suffixes instead
of the default alphabetic suffixes.
2022-02-13 11:18:37 -05:00
Jeffrey Finkelstein
494dc7ec57 split: add SuffixType::NumericHexadecimal
Add a `NumericHexadecimal` member to the `SuffixType` enum so that a
future commit can add support for hexadecimal filename suffixes to the
`split` program.
2022-02-13 11:18:37 -05:00
Jeffrey Finkelstein
7fbd805713 split: refactor to add SuffixType enum
Refactor the code to use a `SuffixType` enumeration with two members,
`Alphabetic` and `NumericDecimal`, representing the two currently
supported ways of producing filename suffixes. This prepares the code
to more easily support other formats, like numeric hexadecimal.
2022-02-13 11:18:37 -05:00
Sylvestre Ledru
6b6d5ee7db
Merge pull request #2827 from jfinkels/split-std-io-copy
split: use std::io::copy() with new writer implementation to improve maintainability and speed
2022-02-12 11:33:12 +01:00
Jeffrey Finkelstein
2f65b29866 split: error when --additional-suffix contains /
Make `split` terminate with a usage error when the
`--additional-suffix` argument contains a directory separator
character.
2022-02-10 19:33:33 -05:00
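The validation in 2f65b29866 amounts to rejecting any directory separator in the suffix. A minimal sketch using `std::path::is_separator` follows. The function name and the error text are hypothetical, not the messages `split` prints.

```rust
/// Accept `suffix` only if it contains no directory separator.
fn validate_additional_suffix(suffix: &str) -> Result<&str, String> {
    if suffix.chars().any(std::path::is_separator) {
        // Hypothetical wording; the real usage error may differ.
        Err(format!("invalid suffix '{}': contains directory separator", suffix))
    } else {
        Ok(suffix)
    }
}
```

Using `is_separator` rather than comparing against `'/'` directly also covers `\` on Windows.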
Jeffrey Finkelstein
b37718de10 split: add BENCHMARKING.md documentation file 2022-02-08 22:58:00 -05:00
Jeffrey Finkelstein
70ca1f45ea split: remove unused ByteSplitter and LineSplitter 2022-02-08 22:58:00 -05:00
Jeffrey Finkelstein
1d7e1b8732 split: use ByteChunkWriter and LineChunkWriter
Replace `ByteSplitter` and `LineSplitter` with `ByteChunkWriter` and
`LineChunkWriter` respectively. This results in a more maintainable
design and an increase in the speed of splitting by lines.
2022-02-08 22:57:57 -05:00
Jeffrey Finkelstein
b31d63eaa9 split: add ByteChunkWriter and LineChunkWriter
Add the `ByteChunkWriter` and `LineChunkWriter` structs and
implementations, but don't use them yet. These structs offer an
alternative approach to writing chunks of output (contrasted with
`ByteSplitter` and `LineSplitter`). The main difference is that
control of which underlying file is being written is inside the writer
instead of outside.
2022-02-08 22:53:56 -05:00
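To illustrate the design in b31d63eaa9, where the writer itself decides which underlying file receives the next bytes, here is a simplified sketch. It borrows the name `ByteChunkWriter` from the commit message, but the fields, the `open_chunk` callback and the generic parameters are assumptions. The real struct is not shown in this log.

```rust
use std::io::{self, Write};

/// A writer that spreads incoming bytes over successive chunk writers,
/// opening a new one every `chunk_size` bytes via `open_chunk`.
struct ByteChunkWriter<W: Write, F: FnMut(usize) -> io::Result<W>> {
    chunk_size: usize,
    remaining: usize, // bytes still allowed in the current chunk
    next_index: usize,
    current: Option<W>,
    open_chunk: F,
}

impl<W: Write, F: FnMut(usize) -> io::Result<W>> ByteChunkWriter<W, F> {
    fn new(chunk_size: usize, open_chunk: F) -> Self {
        assert!(chunk_size > 0);
        Self { chunk_size, remaining: 0, next_index: 0, current: None, open_chunk }
    }
}

impl<W: Write, F: FnMut(usize) -> io::Result<W>> Write for ByteChunkWriter<W, F> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if buf.is_empty() {
            return Ok(0);
        }
        // Switching to the next output happens inside the writer.
        if self.remaining == 0 {
            self.current = Some((self.open_chunk)(self.next_index)?);
            self.next_index += 1;
            self.remaining = self.chunk_size;
        }
        let take = buf.len().min(self.remaining);
        let writer = self.current.as_mut().expect("a chunk was opened above");
        writer.write_all(&buf[..take])?;
        self.remaining -= take;
        Ok(take) // a short write; callers such as io::copy will retry the rest
    }

    fn flush(&mut self) -> io::Result<()> {
        match self.current.as_mut() {
            Some(writer) => writer.flush(),
            None => Ok(()),
        }
    }
}
```

Because the type implements `Write`, the whole split can be driven by `std::io::copy(&mut input, &mut writer)`, with `open_chunk` creating a file for each chunk index.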
Jeffrey Finkelstein
8fa6797255 split: add structure to errors that can be created
Add some structure to errors that can be created during parsing of
settings from command-line options. This commit creates
`StrategyError` and `SettingsError` enumerations to represent the
various parsing and other errors that can arise when transforming
`ArgMatches` into `Settings`.
2022-02-06 20:09:29 -05:00
Jeffrey Finkelstein
e5361a8c11 split: correct error message on invalid arg. to -a
Correct the error message displayed on an invalid parameter to the
`--suffix-length` or `-a` command-line option.
2022-02-06 20:09:29 -05:00
Daniel Eades
4f8d1c5fcf add additional lints 2022-01-31 20:40:47 +01:00
Terts Diepraam
7b3cfcf708
Merge pull request #2868 from jfinkels/split-filename-iterator
split: use iterator to produce filenames
2022-01-30 22:37:37 +01:00
Jeffrey Finkelstein
a5b435da58 split: use iterator to produce filenames
Replace the `FilenameFactory` with `FilenameIterator` and calls to
`FilenameFactory::make()` with calls to `FilenameIterator::next()`. We
did not need the full generality of being able to produce the
filename for an arbitrary chunk index. Instead we need only iterate
over filenames one after another. This allows for a less
mathematically dense algorithm that is easier to understand and
maintain. Furthermore, it can be connected to some familiar concepts
from the representation of numbers as a sequence of digits.

This does not change the behavior of the `split` program, just the
implementation of how filenames are produced.

Co-authored-by: Terts Diepraam <terts.diepraam@gmail.com>
2022-01-30 11:18:58 -05:00
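The "sequence of digits" idea in a5b435da58 can be sketched as an odometer-style iterator. The type below is a simplified, fixed-width, alphabetic-only stand-in with a hypothetical name (`AlphabeticNames`); it is not the `FilenameIterator` from the commit.

```rust
/// Yields `prefix` followed by a fixed-width alphabetic suffix:
/// "xaa", "xab", ..., "xzz", then stops.
struct AlphabeticNames {
    prefix: String,
    digits: Vec<u8>, // each entry is 0..=25, most significant first
    done: bool,
}

impl AlphabeticNames {
    fn new(prefix: &str, width: usize) -> Self {
        Self { prefix: prefix.to_string(), digits: vec![0; width], done: false }
    }
}

impl Iterator for AlphabeticNames {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        if self.done {
            return None;
        }
        let suffix: String = self.digits.iter().map(|&d| (b'a' + d) as char).collect();
        let name = format!("{}{}", self.prefix, suffix);
        // Increment like an odometer: bump the rightmost digit and carry
        // leftwards. If every digit overflows, the names are exhausted.
        self.done = true;
        for d in self.digits.iter_mut().rev() {
            if *d < 25 {
                *d += 1;
                self.done = false;
                break;
            }
            *d = 0; // carry into the next position to the left
        }
        Some(name)
    }
}
```

Each call only needs the previous state, so there is no arithmetic to map an arbitrary chunk index to a name.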
Daniel Eades
ba45fe312a use 'Self' and derive 'Default' where possible 2022-01-30 15:08:26 +01:00
Daniel Eades
784f2e2ea1 use semicolons if nothing returned 2022-01-30 15:08:26 +01:00
Daniel Eades
a2d5f06be4 remove needless pass by value 2022-01-30 15:08:26 +01:00
Sylvestre Ledru
57dc11e586
Merge pull request #2871 from jfinkels/split-settings-methods
split: add a method to convert ArgMatches to Settings
2022-01-30 11:31:58 +01:00
Sylvestre Ledru
7c1abdb7d9
Merge pull request #2866 from jfinkels/split-number-2
split: implement -n option
2022-01-30 09:58:04 +01:00
Terts Diepraam
eb82015b23 all: change macros
- Change the main! proc_macro to a bin! macro_rules macro.
- Reexport uucore_procs from uucore
- Make utils not import uucore_procs directly
- Remove the `syn` dependency and don't parse proc_macro input (hopefully for faster compile times)
2022-01-29 15:26:32 +01:00
Terts Diepraam
9c8e865b55 all: enable infer long arguments in clap 2022-01-29 02:06:29 +01:00
Jeffrey Finkelstein
b636ff04a0 split: implement -n option
Implement the `-n` command-line option to `split`, which splits a file
into a specified number of chunks by byte.
2022-01-27 21:16:27 -05:00
Terts Diepraam
55a47f6fc0
Merge pull request #2863 from tertsdiepraam/clap-3
Clap 3
2022-01-20 23:14:52 +01:00
Roy Ivy III
2e251f91f1 0.0.12 2022-01-19 05:35:00 -06:00
Jeffrey Finkelstein
58f2000406 split: method to convert ArgMatches to Settings
Create a `Settings::from` method that converts a `clap::ArgMatches`
instance into a `Settings` instance. This eliminates the unnecessary
use of a mutable variable when initializing the settings.
2022-01-17 08:58:10 -05:00
Terts Diepraam
8872485922 Merge branch 'main' into clap-3 2022-01-17 13:25:51 +01:00
Sylvestre Ledru
516bdfcfd5
Merge pull request #2872 from jfinkels/split-verbose
split: add --verbose option
2022-01-16 23:19:30 +01:00
Sylvestre Ledru
1fbda8003c coreutils 0.0.8 => 0.0.9, uucore_procs 0.0.7 => 0.0.8, uucore 0.0.10 => 0.0.11 2022-01-16 17:05:48 +01:00
Jeffrey Finkelstein
7af3007204 split: add --verbose option 2022-01-16 09:34:28 -05:00
Terts Diepraam
ecf6f18ab3 split: clap 3 2022-01-11 19:16:48 +01:00
Jeffrey Finkelstein
cfe5a0d82c split: correct filename creation algorithm
Fix two issues with the filename creation algorithm. First, this
corrects the behavior of the `-a` option. This commit ensures a
failure occurs when the number of chunks exceeds the number of
filenames representable with the specified fixed width:

    $ printf "%0.sa" {1..11} | split -d -b 1 -a 1
    split: output file suffixes exhausted

Second, this corrects the default behavior when `-a` is not specified
on the command line. Previously, it always set the filenames to have
suffixes of length 2. This commit corrects
the behavior to follow the algorithm implied by GNU split, where the
filename lengths grow dynamically by two characters once the number of
chunks grows sufficiently large:

    $ printf "%0.sa" {1..91} | ./target/debug/coreutils split -d -b 1 \
    >   && ls x* | tail
    x81
    x82
    x83
    x84
    x85
    x86
    x87
    x88
    x89
    x9000
2022-01-10 20:43:22 -05:00
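The dynamically growing suffixes described in cfe5a0d82c can be sketched as follows. The function name and the closed-form approach are assumptions; the sketch is only checked against the behavior shown in the example above, where `x89` is followed by `x9000`. At each level the first symbol may be anything except the last symbol of the alphabet, which is reserved as a marker that the name continues at a greater width.

```rust
/// Suffix for chunk number `index` when no fixed width is given.
/// `alphabet` lists the symbols in order, e.g. b"0123456789".
fn dynamic_suffix(index: u64, alphabet: &[u8]) -> String {
    assert!(alphabet.len() >= 2, "need at least two symbols");
    let radix = alphabet.len() as u128;
    let last = alphabet[alphabet.len() - 1] as char;
    let mut index = u128::from(index);
    let mut name = String::new();
    let mut width: u32 = 2;
    loop {
        // Names at this level: `width` symbols whose first is not `last`.
        let count = radix.pow(width) - radix.pow(width - 1);
        if index < count {
            break;
        }
        // Level exhausted: emit the marker symbol and widen by one, so the
        // total length grows by two characters.
        index -= count;
        name.push(last);
        width += 1;
    }
    // Write the remaining index as `width` symbols, rightmost first.
    let mut digits = vec![' '; width as usize];
    for slot in digits.iter_mut().rev() {
        *slot = alphabet[(index % radix) as usize] as char;
        index /= radix;
    }
    name.extend(digits);
    name
}
```

With the decimal alphabet, indices 0 through 89 map to `00` through `89`, and index 90 maps to `9000`.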
Jeffrey Finkelstein
e5d6b7a1cf split: correct arg parameters for -b option 2022-01-10 20:43:22 -05:00
Jeffrey Finkelstein
1f937b0760 split: return UResult from uumain() function 2021-12-31 12:19:36 -05:00
Jeffrey Finkelstein
8f04613a84 split: create Strategy enum for chunking strategy 2021-12-30 22:18:17 -05:00
Jeffrey Finkelstein
25d0ccc61d split: move parsing outside of *Splitter::new()
Move the parsing of the output chunk size from inside
`ByteSplitter::new()` and `LineSplitter::new()` to outside. This
eliminates duplicate code and reduces the responsibilities of the
`ByteSplitter` and `LineSplitter` implementations.
2021-12-30 22:17:26 -05:00
Jeffrey Finkelstein
75e742a008 split: correct help text for -l option 2021-12-30 22:17:20 -05:00
Roy Ivy III
f20aa49821 maint/CICD ~ (GHA) fix cargo-udeps false positives (add 'ignore' exceptions to sub-crates) 2021-11-19 17:55:02 -06:00
Sylvestre Ledru
59e9870c56 Prepare version 0.0.8 2021-10-23 19:21:50 +02:00
Sylvestre Ledru
7eaae75bfc add a github action job to identify unused deps 2021-09-15 12:06:50 +02:00
Jan Verbeek
259f18fcab Update message quoting and filename printing 2021-09-07 19:49:01 +02:00
Jan Verbeek
acfd1ebe57 fixup! Run clippy on the full workspace 2021-08-24 17:28:10 +02:00
Michael Debertol
252220e9eb refactor/uucore ~ make util_name and execution_phrase functions
Since util_name and execution_phrase no longer rely on features that are
only available to macros, they may as well be plain functions.
2021-08-14 17:55:18 +02:00
Roy Ivy III
c0854000d1 refactor ~ use execution_phrase!() for usage messaging 2021-08-14 14:01:33 +02:00
Roy Ivy III
23b68d80ba refactor ~ usage() instead of get_usage() 2021-08-14 13:58:43 +02:00
Roy Ivy III
c5792c2a0f refactor ~ use util_name!() as clap::app::App name argument for all utils 2021-08-14 13:53:13 +02:00
Sylvestre Ledru
26a882551b update the dep to uucore_procs 0.0.6 2021-07-11 21:04:11 +02:00
Sylvestre Ledru
1d8a66b7d3 Update to version 0.0.7 2021-07-11 18:04:56 +02:00
Michael Debertol
2ebca384c6 all utils: enable wrap_help
This makes clap wrap the help text according to the terminal width,
which improves readability for terminal widths < 120 chars,
because clap defaults to a width of 120 chars without this feature.
2021-06-27 16:17:10 +02:00
Michael Debertol
0531153fa6 uutils: move clap::App creation to separate functions 2021-06-25 21:23:45 +02:00
Jan Scheer
c0be979611 fix some issues with locale (replace "LANGUAGE" with "LC_ALL")
`LANGUAGE=C` is not enough, `LC_ALL=C` is needed as the environment
variable that overrides all the other localization settings.

e.g.
```bash
$ LANGUAGE=C id foobar
id: ‘foobar’: no such user

$ LC_ALL=C id foobar
id: 'foobar': no such user
```

* replace `LANGUAGE` with `LC_ALL` as environment variable in the tests
* fix the date string of affected uutils
* replace `‘` and `’` with `'`
2021-06-23 11:30:28 +02:00
Jan Scheer
1b824f4914 fix clippy warnings 2021-06-09 15:56:29 +02:00
Jan Scheer
be8650278b Merge branch 'master' into refactoring_parse_size 2021-06-09 13:44:40 +02:00
Roy Ivy III
79a33728ca refactor/split ~ fix cargo clippy complaint (clippy::needless_borrow) 2021-06-06 19:28:24 -05:00
Jan Scheer
130bf49e5d Merge branch 'master' of github.com:uutils/coreutils into refactoring_parse_size 2021-06-03 22:32:34 +02:00
Jan Scheer
ad26b7a042 head/tail/split: make error handling of NUM/SIZE arguments more consistent

* add tests for each flag that takes NUM/SIZE arguments
* fix bug in tail where 'quiet' and 'verbose' flags did not override each other POSIX style
2021-06-03 20:37:29 +02:00