coreutils

mirror of https://github.com/uutils/coreutils synced 2024-12-15 07:42:48 +00:00

Author	SHA1	Message	Date
Jeffrey Finkelstein	77d92883c7	split: implement --line-bytes option Implement the `--line-bytes` option to `split`. In this mode, the program tries to write as many lines of the input as possible to each chunk of output without exceeding a specified byte limit. The new `LineBytesChunkWriter` struct represents this functionality.	2022-03-10 22:51:49 -05:00
Sylvestre Ledru	f3bd1f3020	Add onehundredlines in the spell ignore	2022-03-05 10:27:51 +01:00
Jeffrey Finkelstein	ee36dea1a9	split: implement outputting kth chunk of file Implement `-n l/k/N` option, where the `k`th chunk of the input file is written to stdout. For example, $ seq -w 0 99 > f; split -n l/3/10 f 20 21 22 23 24 25 26 27 28 29	2022-03-05 10:27:51 +01:00
Sylvestre Ledru	346cfa060b	Merge pull request #2980 from jfinkels/split-lines-2 split: add support for "-n l/NUM" option to split	2022-03-01 10:13:44 +01:00
Jeffrey Finkelstein	dbbee573ab	split: add support for "-n l/NUM" option to split Add support for `split -n l/NUM`. Previously, `split` only supported `-n NUM`, which splits a file into `NUM` chunks by byte. The `-n l/NUM` strategy splits a file into `NUM` chunks without splitting lines across chunks.	2022-02-22 18:44:08 -05:00
Omer Tuchfeld	0ce22f3a08	Improve coverage / error messages from `parse_size` PR https://github.com/uutils/coreutils/pull/3084 (`2a333ab391`) had some missing coverage and was merged before I had a chance to fix it. This PR adds some coverage / improved error messages that were missing from that previous PR.	2022-02-22 22:09:45 +01:00
Omer Tuchfeld	fa60898354	Adjust 32-bit tests for tail,split,truncate,head	2022-02-22 13:49:20 +01:00
Jeffrey Finkelstein	6718d97f97	split: add support for -e argument Add the `-e` flag, which indicates whether to elide (that is, remove) empty files that would have been created by the `-n` option. The `-n` command-line argument gives a specific number of chunks into which the input files will be split. If the number of chunks is greater than the number of bytes, then empty files will be created for the excess chunks. But if `-e` is given, then empty files will not be created. For example, contrast $ printf 'a\n' > f && split -e -n 3 f && cat xaa xab xac a cat: xac: No such file or directory with $ printf 'a\n' > f && split -n 3 f && cat xaa xab xac a	2022-02-17 19:03:51 -05:00
Terts Diepraam	e1a611374a	Merge pull request #2981 from jfinkels/split-hex-numbers split: add support for -x option (hex suffixes)	2022-02-17 23:20:58 +01:00
DevSabb	63fa3c81ed	fix failure in test_split	2022-02-14 20:41:58 -05:00
DevSabb	6d6371741a	include io-blksize parameter (#3064) * include io-blksize parameter * format changes for including io-blksize Co-authored-by: DevSabb <devsabb@local> Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>	2022-02-14 19:47:18 +01:00
Jeffrey Finkelstein	a4955b4e06	split: add support for -x option (hex suffixes) Add support for the `-x` command-line option to `split`. This option causes `split` to produce filenames with hexadecimal suffixes instead of the default alphabetic suffixes.	2022-02-13 11:18:37 -05:00
Sylvestre Ledru	6b6d5ee7db	Merge pull request #2827 from jfinkels/split-std-io-copy split: use std::io::copy() with new writer implementation to improve maintainability and speed	2022-02-12 11:33:12 +01:00
Jeffrey Finkelstein	2f65b29866	split: error when --additional-suffix contains / Make `split` terminate with a usage error when the `--additional-suffix` argument contains a directory separator character.	2022-02-10 19:33:33 -05:00
Jeffrey Finkelstein	1d7e1b8732	split: use ByteChunkWriter and LineChunkWriter Replace `ByteSplitter` and `LineSplitter` with `ByteChunkWriter` and `LineChunkWriter` respectively. This results in a more maintainable design and an increase in the speed of splitting by lines.	2022-02-08 22:57:57 -05:00
Jeffrey Finkelstein	ca7af808d5	tests: correct a test case for split Correct the `test_split::test_suffixes_exhausted` test case so that it actually exercises the intended behavior of `split`. Previously, the test fixture contained 26 bytes. After this commit, the test fixture contains 27 bytes. When using a suffix width of one, only 26 filenames should be available when naming chunk files---one for each lowercase ASCII letter. This commit ensures that the filenames will be exhausted as intended by the test.	2022-02-08 22:53:57 -05:00
Jeffrey Finkelstein	e5361a8c11	split: correct error message on invalid arg. to -a Correct the error message displayed on an invalid parameter to the `--suffix-length` or `-a` command-line option.	2022-02-06 20:09:29 -05:00
Daniel Eades	ba45fe312a	use 'Self' and derive 'Default' where possible	2022-01-30 15:08:26 +01:00
Jeffrey Finkelstein	b636ff04a0	split: implement -n option Implement the `-n` command-line option to `split`, which splits a file into a specified number of chunks by byte.	2022-01-27 21:16:27 -05:00
Greg Guthe	771c9f5d9c	tests: update random_chars generator to map u8 to char Fix 'value of type `char` cannot be built from `std::iter::Iterator<Item=u8>`' for split test. refs: https://docs.rs/rand/0.8.4/rand/distributions/struct.Alphanumeric.html#example	2022-01-24 20:40:31 -05:00
Jeffrey Finkelstein	7af3007204	split: add --verbose option	2022-01-16 09:34:28 -05:00
Jeffrey Finkelstein	cfe5a0d82c	split: correct filename creation algorithm Fix two issues with the filename creation algorithm. First, this corrects the behavior of the `-a` option. This commit ensures a failure occurs when the number of chunks exceeds the number of filenames representable with the specified fixed width: $ printf "%0.sa" {1..11} | split -d -b 1 -a 1 split: output file suffixes exhausted Second, this corrects the default behavior when `-a` is not specified on the command line. Previously, it always set the filenames to have suffixes of length 2. This commit corrects the behavior to follow the algorithm implied by GNU split, where the filename lengths grow dynamically by two characters once the number of chunks grows sufficiently large: $ printf "%0.sa" {1..91} | ./target/debug/coreutils split -d -b 1 \ > && ls x* | tail x81 x82 x83 x84 x85 x86 x87 x88 x89 x9000	2022-01-10 20:43:22 -05:00
Jan Scheer	c0be979611	fix some issues with locale (replace "LANGUAGE" with "LC_ALL") `LANGUAGE=C` is not enough, `LC_ALL=C` is needed as the environment variable that overrides all the other localization settings. e.g. ```bash $ LANGUAGE=C id foobar id: ‘foobar’: no such user $ LC_ALL=C id foobar id: 'foobar': no such user ``` * replace `LANGUAGE` with `LC_ALL` as environment variable in the tests * fix the date string of affected uutils * replace `‘` and `’` with `'`	2021-06-23 11:30:28 +02:00
Jan Scheer	f8e96150f8	fix clippy warnings and spelling * add some missing LICENSE headers	2021-06-04 15:39:34 +02:00
Jan Scheer	130bf49e5d	Merge branch 'master' of github.com:uutils/coreutils into refactoring_parse_size	2021-06-03 22:32:34 +02:00
Jan Scheer	2f5f7c6fa1	split: use "parse_size" from uucore * make stderr of parsing SIZE/NUMBER argument consistent with GNU's behavior * add error handling * add tests	2021-06-02 21:32:41 +02:00
Roy Ivy III	4e20dedf58	tests ~ refactor/polish spelling (comments, names, and exceptions)	2021-05-31 08:23:57 -05:00
Jan Scheer	3aeccfd802	fix a lot of clippy warnings	2021-05-29 15:11:22 +02:00
Samuel Ainsworth	b8a3a8995f	Fix test_split_bytes_prime_part_size	2021-05-08 14:25:21 +02:00
Samuel Ainsworth	7c1395366e	Fix split's handling of non-UTF-8 files	2021-05-08 14:25:21 +02:00
Felipe Lema	35a7f01d15	Refactor(split) - migrate from getopts to clap (#1712)	2021-02-11 20:45:23 +01:00
Felipe Lema	88911be6e0	`--filter` argument for `split` (#1681)	2021-01-18 14:42:44 +01:00
Jens Humrich	bfca334ec1	style issues	2020-09-17 12:40:48 +02:00
Jens Humrich	5a75905476	Add additional-suffix option to split	2020-09-16 17:59:39 +02:00
Roy Ivy III	de0375f909	tests ~ reorganize tests	2020-06-01 18:30:04 -05:00

35 commits
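
The dynamic suffix growth described in commit cfe5a0d82c (`x89` followed by `x9000` when `-d` is given without `-a`) can be sketched as below. This is only an illustration of the naming pattern, not the code from the uutils repository: the function name `dynamic_decimal_suffix` is hypothetical, and the behavior beyond the first rollover (for example `9899` followed by `990000`) is an assumption extrapolated from the example in the commit message.

```rust
/// Hypothetical helper: return the numeric suffix for the chunk at
/// zero-based position `index`, assuming the default starting width of 2.
///
/// Names are handed out in groups. Group 0 is "00".."89" (90 names).
/// Group 1 is "9000".."9899" (900 names). Each later group adds one more
/// leading '9' and one more counting digit, so the suffix length grows
/// by two characters per group.
fn dynamic_decimal_suffix(index: u64) -> String {
    let mut remaining = u128::from(index);
    let mut prefix_len: usize = 0; // number of leading '9' characters
    let mut width: usize = 2; // number of counting digits after the prefix
    loop {
        // A group with `width` counting digits holds 9 * 10^(width - 1) names,
        // because the values starting with '9' are reserved as the next prefix.
        let capacity = 9u128 * 10u128.pow(width as u32 - 1);
        if remaining < capacity {
            return format!(
                "{}{:0width$}",
                "9".repeat(prefix_len),
                remaining,
                width = width
            );
        }
        remaining -= capacity;
        prefix_len += 1;
        width += 1;
    }
}
```

Under these assumptions, `dynamic_decimal_suffix(89)` yields `89` and `dynamic_decimal_suffix(90)` yields `9000`, matching the `x89 x9000` tail of the listing in the commit message.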