coreutils

mirror of https://github.com/uutils/coreutils synced 2024-12-13 23:02:38 +00:00

Author	SHA1	Message	Date
Jeffrey Finkelstein	77d92883c7	split: implement --line-bytes option Implement the `--line-bytes` option to `split`. In this mode, the program tries to write as many lines of the input as possible to each chunk of output without exceeding a specified byte limit. The new `LineBytesChunkWriter` struct represents this functionality.	2022-03-10 22:51:49 -05:00
Jeffrey Finkelstein	ee36dea1a9	split: implement outputting kth chunk of file Implement `-n l/k/N` option, where the `k`th chunk of the input file is written to stdout. For example, $ seq -w 0 99 > f; split -n l/3/10 f 20 21 22 23 24 25 26 27 28 29	2022-03-05 10:27:51 +01:00
Jeffrey Finkelstein	6718d97f97	split: add support for -e argument Add the `-e` flag, which indicates whether to elide (that is, remove) empty files that would have been created by the `-n` option. The `-n` command-line argument gives a specific number of chunks into which the input files will be split. If the number of chunks is greater than the number of bytes, then empty files will be created for the excess chunks. But if `-e` is given, then empty files will not be created. For example, contrast $ printf 'a\n' > f && split -e -n 3 f && cat xaa xab xac a cat: xac: No such file or directory with $ printf 'a\n' > f && split -n 3 f && cat xaa xab xac a	2022-02-17 19:03:51 -05:00
Terts Diepraam	e1a611374a	Merge pull request #2981 from jfinkels/split-hex-numbers split: add support for -x option (hex suffixes)	2022-02-17 23:20:58 +01:00
Jeffrey Finkelstein	ba1ce7179b	dd: move unit tests into dd.rs and test_dd.rs Clean up unit tests in the `dd` crate to make them easier to manage. This commit does a few things. * move test cases that test the complete functionality of the `dd` program from the `dd_unit_tests` module up to the `tests/by-util/test_dd.rs` module so that they can take advantage of the testing framework and common testing tools provided by uutils, * move test cases that test internal functions of the `dd` implementation into the `tests` module within `dd.rs` so that they live closer to the code they are testing, * replace test cases defined by macros with test cases defined by plain old functions to make the test cases easier to read at a glance.	2022-02-15 21:50:48 -05:00
Jeffrey Finkelstein	a4955b4e06	split: add support for -x option (hex suffixes) Add support for the `-x` command-line option to `split`. This option causes `split` to produce filenames with hexadecimal suffixes instead of the default alphabetic suffixes.	2022-02-13 11:18:37 -05:00
Sylvestre Ledru	f9e04ae5ef	Merge pull request #2966 from allan-silva/wc-files0-from-opt wc: implement files0-from option	2022-02-12 19:05:05 +01:00
Sylvestre Ledru	6b6d5ee7db	Merge pull request #2827 from jfinkels/split-std-io-copy split: use std::io::copy() with new writer implementation to improve maintainability and speed	2022-02-12 11:33:12 +01:00
Shreyans Jain	3176ad5c1b	tests/hashsum: Fix missing space in checkfile	2022-02-10 13:55:53 +05:30
Shreyans Jain	30d7a4b167	hashsum: Add BLAKE3 to Hashing Algorithms Signed-off-by: Shreyans Jain <shreyansthebest2007@gmail.com>	2022-02-10 12:46:44 +05:30
Jeffrey Finkelstein	1d7e1b8732	split: use ByteChunkWriter and LineChunkWriter Replace `ByteSplitter` and `LineSplitter` with `ByteChunkWriter` and `LineChunkWriter` respectively. This results in a more maintainable design and an increase in the speed of splitting by lines.	2022-02-08 22:57:57 -05:00
Jeffrey Finkelstein	ca7af808d5	tests: correct a test case for split Correct the `test_split::test_suffixes_exhausted` test case so that it actually exercises the intended behavior of `split`. Previously, the test fixture contained 26 bytes. After this commit, the test fixture contains 27 bytes. When using a suffix width of one, only 26 filenames should be available when naming chunk files---one for each lowercase ASCII letter. This commit ensures that the filenames will be exhausted as intended by the test.	2022-02-08 22:53:57 -05:00
Allan Silva	6a6875012e	wc: implement files0-from option When this option is present, the files argument is not processed. This option processes the file list from provided file, splitting them by the ascii NUL (\0) character. When files0-from is '-', the file list is processed from stdin.	2022-02-04 10:12:08 -03:00
Terts Diepraam	7fc82cd376	Merge pull request #2902 from jtracey/join-non-unicode-sep join: add support for non-unicode field separators	2022-01-31 21:54:56 +01:00
Terts Diepraam	7477761428	Merge pull request #2882 from jtracey/join-bigfields-compat join: "support" field numbers larger than usize::MAX	2022-01-31 21:52:13 +01:00
Justin Tracey	58d65fb953	join: add support for non-unicode field separators This allows for `-t` to take invalid unicode (but still single-byte) values on unix-like platforms. Other platforms, which as of the time of this commit do not support `OsStr::as_bytes()`, could possibly be supported in the future, but would require design decisions as to what that means.	2022-01-30 20:04:22 -05:00
Sylvestre Ledru	7c1abdb7d9	Merge pull request #2866 from jfinkels/split-number-2 split: implement -n option	2022-01-30 09:58:04 +01:00
Sylvestre Ledru	52ab6325a0	Merge pull request #2881 from jtracey/join-null-field-sep join: add support for `-t '\0'`	2022-01-29 10:55:04 +01:00
Jeffrey Finkelstein	b636ff04a0	split: implement -n option Implement the `-n` command-line option to `split`, which splits a file into a specified number of chunks by byte.	2022-01-27 21:16:27 -05:00
Cecylia Bocovich	c8f9ea5b15	tests/join: test default check order behaviour	2022-01-22 17:51:29 -05:00
Justin Tracey	ce3df12eaa	join: "support" field numbers larger than usize::MAX They silently get folded to usize::MAX, which is the official GNU behavior.	2022-01-17 17:49:41 -05:00
Terts Diepraam	08efa1fe5a	Merge branch 'main' into join-null-field-sep	2022-01-17 12:59:52 +01:00
Justin Tracey	109277d405	join: add support for `-t '\0'`	2022-01-16 18:05:58 -05:00
Justin Tracey	346415e1d2	join: add support for -z option	2022-01-16 17:56:07 -05:00
Sylvestre Ledru	00c11b184f	Merge pull request #2851 from jtracey/join-strless join: operate on bytes instead of Strings	2022-01-16 16:24:38 +01:00
Jeffrey Finkelstein	cfe5a0d82c	split: correct filename creation algorithm Fix two issues with the filename creation algorithm. First, this corrects the behavior of the `-a` option. This commit ensures a failure occurs when the number of chunks exceeds the number of filenames representable with the specified fixed width: $ printf "%0.sa" {1..11} | split -d -b 1 -a 1 split: output file suffixes exhausted Second, this corrects the default behavior when `-a` is not specified on the command line. Previously, it was always setting the filenames to have length 2 suffixes. This commit corrects the behavior to follow the algorithm implied by GNU split, where the filename lengths grow dynamically by two characters once the number of chunks grows sufficiently large: $ printf "%0.sa" {1..91} | ./target/debug/coreutils split -d -b 1 \ > && ls x* | tail x81 x82 x83 x84 x85 x86 x87 x88 x89 x9000	2022-01-10 20:43:22 -05:00
Justin Tracey	4df2f3c148	join: add test for non-Unicode files	2022-01-08 21:28:29 -05:00
Justin Tracey	cdfe64369d	join: add test for non-linefeed newline characters	2022-01-08 19:51:16 -05:00
Jan Scheer	94cc966535	tail: change notify backend on macOS from `FSEvents` to `kqueue` On macOS only `kqueue` is suitable for our use case because `FSEvents` waits until file close to deliver modify events.	2021-10-01 21:33:30 +02:00
Jan Scheer	5615ba9fe1	test_tail: add tests for `--follow=name`	2021-09-27 23:18:00 +02:00
Jan Verbeek	6f7d740592	wc: Do a chunked read with proper UTF-8 handling This brings the results mostly in line with GNU wc and solves nasty behavior with long lines.	2021-08-26 01:38:16 +02:00
jfinkels	bdc0f4b7c3	hashsum: support --check for algorithms with variable output length (#2583) * hashsum: support --check for var. length outputs Add the ability for `hashsum --check` to work with algorithms with variable output length. Previously, the program would terminate with an error due to constructing an invalid regular expression. * fixup! hashsum: support --check for var. length outputs	2021-08-23 18:35:19 +02:00
jfinkels	4ef35d4a96	tac: correct behavior of -b option (#2523) * tac: correct behavior of -b option Correct the behavior of `tac -b` to match that of GNU coreutils `tac`. Specifically, this changes `tac -b` to assume leading line separators instead of the default trailing line separators. Before this commit, the (incorrect) behavior was $ printf "/abc/def" | tac -b -s "/" def/abc/ After this commit, the behavior is $ printf "/abc/def" | tac -b -s "/" /def/abc Fixes #2262. * fixup! tac: correct behavior of -b option * fixup! tac: correct behavior of -b option Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>	2021-08-22 21:01:17 +02:00
Justin Tracey	1bb0237281	join: add support for full outer joins	2021-08-12 23:52:35 -04:00
Tyler	601c9fc620	Merge branch 'master' of https://github.com/uutils/coreutils into uutils-master-2	2021-08-03 17:33:43 -07:00
Michael Debertol	418f5b7692	sort: handle empty merge inputs	2021-07-31 21:02:20 +02:00
Tyler	076ff32e85	Removes project-specific cspell files.	2021-07-23 14:53:24 -07:00
Tyler	885a875552	Addresses build errors - Adds words to cspell exceptions - Converts test macros to use Default trait. - Converts parser to use Default trait. - Adds Windows-friendly test files for block/unblock when nl is present in test/spec file.	2021-07-22 16:04:35 -07:00
Tyler	88363858d5	Minor changes.	2021-07-12 10:13:47 -07:00
Tyler	2e9e984b3a	Adds test with unicode filename Filenames should be the only spot where unicode can appear in dd's cli.	2021-07-06 18:17:30 -07:00
backwaterred	9c38583c6b	Merge pull request #2 from uutils/master catchup with uutils main	2021-07-02 11:34:22 -07:00
Tyler	92281585a7	Merge branch 'master' of https://github.com/uutils/coreutils into uutils-master	2021-07-01 14:33:30 -07:00
Tyler	17cfba41cc	Implements project testing from root. - conv=FLAG testing. (1) WIP conv=nocreat - iflag & oflag testing. - conv=CONV ascii,...,ucase,...,block,...sync tests at unit-test-level (project root is todo)	2021-06-30 14:47:48 -07:00
Michael Debertol	233a778963	sort/ls: implement version cmp matching GNU spec This reimplements version_cmp, which is used in sort and ls to sort according to versions. However, it is not bug-for-bug identical with GNU's implementation. I reported a bug with GNU here: https://lists.gnu.org/archive/html/bug-coreutils/2021-06/msg00045.html This implementation does not contain the bugs regarding the handling of file extensions and null bytes.	2021-06-27 15:29:17 +02:00
Michael Debertol	548a895cd6	sort: compatibility of human-numeric sort Closes #1985. This makes human-numeric sort follow the same algorithm as GNU's/FreeBSD's sort. As documented by GNU in https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html, we first compare by sign, then by si unit and finally by the numeric value.	2021-06-25 18:19:00 +02:00
Syukron Rifail M	bc8415c9db	du: add --dereference	2021-06-17 14:06:41 +07:00
Terts Diepraam	8afc923796	Merge pull request #2237 from wfscheper/wfscheper/issue2118 chgrp: replace getopts with clap (#2118)	2021-06-12 11:20:24 +02:00
Walter Scheper	cff75f242a	chgrp: replace getopts with clap (#2118)	2021-06-10 16:38:44 -04:00
Michael Debertol	66359a0f56	sort: insert line separators after non-empty files If files don't end with a line separator we have to insert one, otherwise the last line will be combined with the first line of the next file.	2021-06-06 18:01:08 +02:00
Michael Debertol	7ffc7d073c	cp: test that file descriptors are closed	2021-06-02 19:21:16 +02:00
Michael Debertol	06b3092f5f	sort: fix debug output for zeros / invalid numbers We were reporting "no match" when sorting something like "0 ". This is because we don't distinguish between 0 and invalid lines when sorting. For debug output we have to get this information back.	2021-06-01 18:18:51 +02:00
Sylvestre Ledru	badf7aacb7	Merge pull request #2300 from tertsdiepraam/pr Implement `pr` (resurrection of the resurrected PR)	2021-05-31 21:14:57 +02:00
Roy Ivy III	4e20dedf58	tests ~ refactor/polish spelling (comments, names, and exceptions)	2021-05-31 08:23:57 -05:00
Terts Diepraam	7690dc018f	Merge branch 'master' into pr	2021-05-31 15:23:06 +02:00
Michael Debertol	dc63133f14	sort: correctly inherit global flags for keys (#2302) Closes #2254. We should only inherit global settings for keys when there are absolutely no options attached to the key. The default key (matching the whole line) is implicitly added only if no keys are supplied. Improved some error messages by including more context.	2021-05-29 23:25:56 +02:00
Terts Diepraam	bc1870c0a7	Merge branch 'master' into pr	2021-05-29 19:21:31 +02:00
Michael Debertol	e9656a6c32	sort: make GNU test sort-debug-keys pass (#2269) * sort: disable support for thousand separators In order to be compatible with GNU, we have to disable thousands separators. GNU does not enable them for the C locale, either. Once we add support for locales we can add this feature back. * sort: delete unused fixtures * sort: compare -0 and 0 equal I must have misunderstood this when implementing, but GNU considers -0, 0, and invalid numbers to be equal. * sort: strip blanks before applying the char index * sort: don't crash when key start is after key end * sort: add "no match" for months at the first non-whitespace char We should put the "^ no match for key" indicator at the first non-whitespace character of a field. * sort: improve support for e notation * sort: use matches! macros	2021-05-28 22:38:29 +02:00
Jeffrey Finkelstein	659bf58a4c	head: print headings when reading multiple files Fix a bug in which `head` failed to print headings for `stdin` inputs when reading from multiple files, and fix another bug in which `head` failed to print a blank line between the contents of a file and the heading for the next file when reading multiple files. The output now matches that of GNU `head`.	2021-05-16 12:03:10 -04:00
Michael Debertol	e0ebf907a4	sort: make merging stable When merging files we need to prioritize files that occur earlier in the command line arguments with -m. This also makes the extsort merge step (and thus extsort itself) stable again.	2021-05-09 11:43:38 +02:00
Jeffrey Finkelstein	0cafe2b70d	wc: add tests for edge cases for wc on files	2021-05-03 21:07:32 -04:00
Daniel Rocco	c3912d53ac	test: add tests for basic tests & edge cases Some edge cases covered: - no args - operator by itself (!, -a, etc.) - string, file tests of nothing - compound negations	2021-05-01 22:40:47 -04:00
Ricardo Iglesias	05b20c32a9	base64: Moved argument parsing to clap. Moved argument parsing to clap and added tests to cover using "-" as stdin, passing in too many file arguments, and updated the "wrap" error message in the tests.	2021-05-01 11:36:46 -07:00
Sylvestre Ledru	a37e3181a2	Merge pull request #2130 from electricboogie/master sort: implement --buffer-size and --temporary-directory (external sort)	2021-04-28 09:21:14 +02:00
Ricardo Iglesias	d56462a4b3	base32: Fixed style violations. Added tests Tests now cover using "-" as standard input and reading from a file.	2021-04-26 08:00:55 -07:00
electricboogie	4c395146dd	Merge branch 'master' of https://github.com/uutils/coreutils	2021-04-25 10:11:27 -05:00
Michael Debertol	e6f6b109a5	sort: implement --debug This adds a --debug flag, which, when activated, will draw lines below the characters that are actually used for comparisons. This is not a complete implementation of --debug. It should, quoting the man page for GNU sort: "annotate the part of the line used to sort, and warn about questionable usage to stderr". Warning about "questionable usage" is not part of this patch. This change required some adjustments to be able to get the range that is actually used for comparisons. Most notably, general numeric comparisons were rewritten, fixing some bugs along the lines. Testing is mostly done by adding fixtures for the expected debug output of existing tests.	2021-04-23 22:36:15 +02:00
electricboogie	25021f31eb	Incorporate overhead of Line struct	2021-04-19 21:24:52 -05:00
Michael Debertol	4bbbe3a3f2	sort: implement numeric string comparison (#2070) * sort: implement numeric string comparison This implements -n and -h using a string comparison algorithm instead of parsing each number to a f64 and comparing those. This should result in a moderate performance increase and eliminate loss of precision. * cache parsed f64 numbers For general numeric comparisons we have to parse numbers as f64, as this behavior is explicitly documented by GNU coreutils. We can however cache the parsed value to speed up comparisons. * fix leading zeroes for negative numbers * use more appropriate name for exponent * improvements to the parse function * move checks into main loop and fix thousands separator condition * remove unneeded checks * rustfmt	2021-04-17 13:49:35 +02:00
electricboogie	a76d452f75	Sort: More small fixes (#2065) * Various fixes and performance improvements * fix a typo Co-authored-by: Michael Debertol <michael.debertol@gmail.com> * Fix month parse for months with leading whitespace * Implement test for months whitespace fix * Confirm human numeric works as expected with whitespace with a test * Correct arg help value name for --parallel * Fix SemVer non version lines/empty line sorting with a test Co-authored-by: Sylvestre Ledru <sledru@mozilla.com> Co-authored-by: Michael Debertol <michael.debertol@gmail.com>	2021-04-17 10:06:19 +02:00
Árni Dagur	eb4971e6f4	cat: Unrevert splice patch (#2020) * cat: Unrevert splice patch * cat: Add fifo test * cat: Add tests for error cases * cat: Add tests for character devices * wc: Make sure we handle short splice writes * cat: Fix tests for 1.40.0 compiler * cat: Run rustfmt on test_cat.rs * Run 'cargo +1.40.0 update'	2021-04-10 22:19:53 +02:00
Sylvestre Ledru	bf1944271c	remove .DS_Store	2021-04-10 21:57:03 +02:00
Michael Debertol	69f4410a8a	sort: dedup using compare_by (#2064) compare_by is the function used for sorting, we should use it for dedup as well.	2021-04-10 19:49:10 +02:00
electricboogie	e5113ad00e	Sort: Various fixes and performance improvements (#2057) * Various fixes and performance improvements * fix a typo Co-authored-by: Michael Debertol <michael.debertol@gmail.com> Co-authored-by: Sylvestre Ledru <sledru@mozilla.com> Co-authored-by: Michael Debertol <michael.debertol@gmail.com>	2021-04-10 11:56:20 +02:00
Sivachandran	ee070028e4	install: implement stripping symbol table (#2047)	2021-04-10 11:53:29 +02:00
Sylvestre Ledru	844e318a67	Merge branch 'master' into pr	2021-04-09 22:02:25 +02:00
electricboogie	8474249e5f	Sort: Implement stable sort, ignore non-printing, month sort dedup, auto parallel sort through rayon, zero terminated sort, check silent (#2008)	2021-04-08 22:07:09 +02:00
Yagiz Degirmenci	c965effe07	fold: move to clap, add tests (#2015)	2021-04-06 22:51:27 +02:00
Sylvestre Ledru	f57eb0fdfa	Merge pull request #1993 from cbjadwani/master uniq: Implement --group option	2021-04-05 22:33:04 +02:00
Yagiz Degirmenci	cbe07c93c6	cksum: add tests and fixtures (#1923)	2021-04-05 22:21:21 +02:00
Daniel Rocco	e5c61a28be	fold: variable width tabs, guard treating tab as whitespace Treat tab chars as advancing to the next tab stop rather than having a fixed 8-column width. Also treat tab as a whitespace split target only when splitting on word boundaries.	2021-04-05 08:55:07 -04:00
Chirag Jadwani	19c6a42de5	uniq: implement group option	2021-04-04 15:22:17 +05:30
Paul Otten	7859bf885f	Consistency with GNU version of `du` when doing `du -h` on an empty file	2021-04-01 19:42:43 -04:00
Mikadore	8320b1ec5f	Rewrote head (#1911) See https://github.com/uutils/coreutils/pull/1911 for the details	2021-03-29 13:08:48 +02:00
jaggededgedjustice	88d0bb01c0	Add shuf tests (#1958) * Add tests for shuf * Fixup GNU tests for shuf	2021-03-28 17:52:01 +02:00
Max Semenik	62fe68850e	pr: Fixes after rebasing Only the minimum needed to: * Make everything compile without warnings * Move files according to the new project structure * Make tests pass	2021-03-26 17:57:19 +03:00
tilakpatidar	75b35e6002	pr: remove not required tests	2021-03-26 14:11:15 +03:00
tilakpatidar	054c05d5d8	pr: refactor get_lines_for_printing, write_columns, recreate_arguments pr: extract recreate_arguments pr: refactor get_line_for_printing pr: refactor get_lines_for_printing pr: refactor fetch_indexes generate for write_columns pr: refactor write_columns pr: refactor write_columns	2021-03-26 14:11:15 +03:00
tilakpatidar	40e7f3d900	pr: add -J and -S option pr: add -J option pr: add -S option	2021-03-26 14:11:15 +03:00
tilakpatidar	a4b723233a	pr: add more tests for form feed and include page_width option W	2021-03-26 14:11:15 +03:00
tilakpatidar	3be5dc6923	pr: fix form feed pr: fix form feed pr: Rustfmt pr: add test for ff and -l option	2021-03-26 14:11:15 +03:00
Tilak Patidar	5956894d00	pr: add -m and -o option pr: Add -o option	2021-03-26 14:11:14 +03:00
Tilak Patidar	dd07aed4d1	pr: add column separator option	2021-03-26 14:11:14 +03:00
Tilak Patidar	69371ce3ce	pr: add tests for --column with across option	2021-03-26 14:11:14 +03:00
Tilak Patidar	f497fb9d88	pr: read from stdin	2021-03-26 14:11:14 +03:00
Tilak Patidar	d9084a7399	pr: implement across option and fix tests	2021-03-26 14:11:14 +03:00
Tilak Patidar	f3676573b5	pr: print padded string for each column and handle tab issues pr: Print fixed padded string for each column pr: Fix display length vs str length due to tabs	2021-03-26 14:11:14 +03:00
Tilak Patidar	b578bb6563	pr: add test for -t -l -r option pr: Add test for -l option pr: Add test for -r suppress error option	2021-03-26 14:11:14 +03:00
Tilak Patidar	b742230dbb	pr: fix page ranges pr: Fix page ranges	2021-03-26 14:11:14 +03:00
Tilak Patidar	88ec02a61c	pr: add support for -n [char][width] and -N pr: Fix long name for -n pr: Add -N first line number option pr: Add -n[char][width] support	2021-03-26 14:11:14 +03:00
Tilak Patidar	afc58eb6ea	pr: add tests for -n -h -d option pr: Add test for -h option pr: Add test for -d option	2021-03-26 14:11:14 +03:00

216 commits