123 commits

Author SHA1 Message Date
Terts Diepraam
975a1d170d change remaining usage codes of 2 to 1 for GNU compat 2022-09-10 20:24:24 +02:00
Daniel Hofstetter
747ed592d9 Replace allow_invalid_utf8() with value_parser() 2022-08-25 15:21:50 +02:00
Terts Diepraam
15180249fc Version 0.0.15 2022-08-20 13:13:22 +02:00
Daniel Hofstetter
62b1b7cfb2 Replace deprecated values_of_os() with get_many() 2022-08-20 08:19:11 +02:00
Niyaz Nigmatullin
9cd898b885 remove nix 0.24.2 dependency 2022-08-17 13:13:27 +03:00
Sylvestre Ledru
9f1219005d fix the significant_drop_in_scrutinee clippy warning 2022-08-10 21:37:48 +02:00
Daniel Hofstetter
7c3116330e Replace deprecated is_present() with contains_id() 2022-08-02 15:21:39 +02:00
Daniel Hofstetter
fc4544c42b bump clap from 3.1.18 to 3.2.15 2022-07-29 14:05:02 +02:00
Owen Anderson
d5f59f23fa Implement wc fast paths that skip Unicode decoding.
Byte, character, and line counting can all be done on the raw bytes
of the incoming stream without decoding the Unicode characters. This
fact was previously exploited in specific fast paths for counting
characters and counting lines. This change unifies those fast paths into
a single shared fast path, using const generics to specialize the
function for each use case. This has the benefit of making sure that all
combinations of these Unicode-oblivious fast paths benefit from the same
optimization.

On my laptop, this speeds up `wc -clm odyssey1024.txt` from 840ms to
120ms. I experimented with using a filter loop for line counting, but
continuing to use the bytecount crate came out ahead by a significant
margin.
2022-07-23 10:45:26 -07:00
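The unified fast path described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names `Counts`, `count_raw`, and `count_dispatch` are hypothetical, and a plain byte loop stands in for the bytecount crate so the sketch has no dependencies.

```rust
/// Hypothetical result type for this sketch; not the type used in wc.
#[derive(Debug, Default, PartialEq)]
struct Counts {
    bytes: usize,
    chars: usize,
    lines: usize,
}

/// One shared fast path over raw bytes. Each `const` flag is known at
/// compile time, so every instantiation keeps only the work it needs.
fn count_raw<const BYTES: bool, const CHARS: bool, const LINES: bool>(data: &[u8]) -> Counts {
    let mut counts = Counts::default();
    for &b in data {
        // A byte that is not of the form 0b10xx_xxxx starts a character.
        if CHARS && (b & 0xC0) != 0x80 {
            counts.chars += 1;
        }
        if LINES && b == b'\n' {
            counts.lines += 1;
        }
    }
    if BYTES {
        counts.bytes = data.len();
    }
    counts
}

/// Maps runtime flags onto the matching compile-time instantiation.
fn count_dispatch(data: &[u8], bytes: bool, chars: bool, lines: bool) -> Counts {
    match (bytes, chars, lines) {
        (false, false, false) => count_raw::<false, false, false>(data),
        (false, false, true) => count_raw::<false, false, true>(data),
        (false, true, false) => count_raw::<false, true, false>(data),
        (false, true, true) => count_raw::<false, true, true>(data),
        (true, false, false) => count_raw::<true, false, false>(data),
        (true, false, true) => count_raw::<true, false, true>(data),
        (true, true, false) => count_raw::<true, true, false>(data),
        (true, true, true) => count_raw::<true, true, true>(data),
    }
}
```

Calling `count_dispatch(data, true, true, true)` corresponds to the `wc -clm` case mentioned above. The commit reports that bytecount outperformed a filter loop for line counting, so a real implementation would likely not count lines the way this sketch does.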
Owen Anderson
417ad0e384 Add rustdoc comment. 2022-07-20 23:32:50 -07:00
Owen Anderson
13762cae05 Implement a fast path for character counting in wc.
When wc is invoked with only the -m flag, we only need to count the
number of Unicode characters in the input. In order to do so, we don't
actually need to decode the input bytes into characters. Rather, we can
simply count the number of non-continuation bytes in the UTF-8 stream,
since every character will contain exactly one non-continuation byte.

On my laptop, this speeds up `wc -m odyssey1024.txt` from 745ms to
109ms.
2022-07-20 22:35:40 -07:00
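The counting trick described in this commit can be illustrated with a short sketch. The function name `count_chars_fast` is hypothetical and the code is not taken from the commit; it assumes the input is valid UTF-8.

```rust
/// Counts characters in UTF-8 input without decoding it.
/// Continuation bytes have the bit pattern 0b10xx_xxxx; every character
/// contains exactly one byte that does NOT match that pattern.
fn count_chars_fast(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| (b & 0xC0) != 0x80).count()
}
```

For valid UTF-8 this agrees with `str::chars().count()`. For invalid input the two may differ, because a decoder has to decide how to treat malformed sequences while this loop only looks at bit patterns.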
Andrew Baptist
cc08e1cc3a Update to handle all the latest cargo warnings 2022-07-18 13:20:49 -04:00
Owen Anderson
735db78b3d
wc: specialize scanning loop on settings. (#3708)
* wc: specialize scanning loop on settings.

The primary computational loop in wc (iterating over all the
characters and computing word lengths, etc) is configured by a
number of boolean options that control the text-scanning behavior.
If we monomorphize the code loop for each possible combination of
scanning configurations, rustc is able to generate better code
for each instantiation, at the least by removing the conditional
checks on each iteration, and possibly by allowing things like
vectorization.

On my computer (aarch64/macos), I am seeing at least a 5% performance
improvement in release builds on all wc flag configurations
(other than those that were already specialized) against
odyssey1024.txt, with wc -l showing the greatest improvement at 15%.

* Reduce the size of the wc dispatch table by half.

By extracting the handling of hand-written fast-paths to the
same dispatch as the automatic specializations, we can avoid
needing to pass `show_bytes` as a const generic to
`word_count_from_reader_specialized`. Eliminating this parameter
halves the number of arms in the dispatch.
2022-07-18 12:16:52 +02:00
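As a rough illustration of specializing the scanning loop on its settings, the sketch below monomorphizes a character loop over two boolean options. The names `ScanTotals`, `scan`, and `scan_with` are hypothetical, and the line length here is a simple character count rather than the display width that wc reports.

```rust
/// Hypothetical totals for this sketch; not the type used in wc.
#[derive(Debug, Default, PartialEq)]
struct ScanTotals {
    lines: usize,
    words: usize,
    max_line_length: usize,
}

/// Scanning loop specialized at compile time on which totals are wanted.
/// In an instantiation where a flag is `false`, the compiler can drop
/// the corresponding branch from the loop body entirely.
fn scan<const WORDS: bool, const MAX_LEN: bool>(text: &str) -> ScanTotals {
    let mut totals = ScanTotals::default();
    let mut in_word = false;
    let mut current_len = 0usize;
    for ch in text.chars() {
        if ch == '\n' {
            totals.lines += 1;
        }
        if WORDS {
            if ch.is_whitespace() {
                in_word = false;
            } else if !in_word {
                // First non-whitespace character after a gap starts a word.
                in_word = true;
                totals.words += 1;
            }
        }
        if MAX_LEN {
            if ch == '\n' {
                totals.max_line_length = totals.max_line_length.max(current_len);
                current_len = 0;
            } else {
                current_len += 1;
            }
        }
    }
    if MAX_LEN {
        // Account for a final line without a trailing newline.
        totals.max_line_length = totals.max_line_length.max(current_len);
    }
    totals
}

/// Dispatch table: one arm per combination of runtime settings.
fn scan_with(text: &str, words: bool, max_len: bool) -> ScanTotals {
    match (words, max_len) {
        (false, false) => scan::<false, false>(text),
        (false, true) => scan::<false, true>(text),
        (true, false) => scan::<true, false>(text),
        (true, true) => scan::<true, true>(text),
    }
}
```

Each additional boolean option doubles the number of arms in the dispatch, which is why removing one const generic parameter, as the second bullet describes, halves the table.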
dependabot[bot]
d15b95533e
build(deps): bump nix from 0.24.1 to 0.24.2
Bumps [nix](https://github.com/nix-rust/nix) from 0.24.1 to 0.24.2.
- [Release notes](https://github.com/nix-rust/nix/releases)
- [Changelog](https://github.com/nix-rust/nix/blob/v0.24.2/CHANGELOG.md)
- [Commits](https://github.com/nix-rust/nix/compare/v0.24.1...v0.24.2)

---
updated-dependencies:
- dependency-name: nix
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-07-18 06:41:18 +00:00
Gergely Kalas
4f043ff57f Fix 'wc' gnu test-suite compatibility #3678
This change will extract a utility already present in ls to uucore.
This utility is used by dir and vdir too, which are adjusted to
look it up in uucore. No further changes to ls, dir or vdir intended.

The change here largely adjusts the output of uu_wc to match
that of GNU wc. It goes far enough to make the unit tests
pass; however, some differences remain. One specific
difference I did not tackle is that GNU wc will not align the
output columns (compute_number_width() -> 1) in the specific case
of the input for --files0-from=- being a named pipe, not real stdin.
This difference can be triggered using the following two invocations.
  - wc --files0-from=- < files0 # use a named pipe, GNU does align
  - cat files0- | wc --files0-from=- # use real stdin, GNU does not
    align.
2022-07-01 16:43:09 +02:00
dependabot[bot]
82e81da967 build(deps): bump bytecount from 0.6.2 to 0.6.3
Bumps [bytecount](https://github.com/llogiq/bytecount) from 0.6.2 to 0.6.3.
- [Release notes](https://github.com/llogiq/bytecount/releases)
- [Commits](https://github.com/llogiq/bytecount/commits)

---
updated-dependencies:
- dependency-name: bytecount
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-06 10:07:55 +02:00
Terts Diepraam
eae07adfb1
Version 0.0.14 (#3553)
Version 0.0.14
2022-05-22 19:57:19 +02:00
Terts Diepraam
0acfa07d77 all: add value hints 2022-05-13 16:15:50 +02:00
Ryan Zoeller
363a2a5611 Upgrade nix to 0.24.1, ctrlc to 3.2.2
Limit nix features, which should help compile times slightly.

Replace usage of deprecated nix functionality with std equivalent.
2022-04-25 08:13:20 +02:00
Justin Tracey
1f025c19af address libc weirdness on 32 bit android 2022-04-20 08:44:49 +02:00
Justin Tracey
2a0d58d060 get android builds to compile and pass tests 2022-04-20 08:44:49 +02:00
Terts Diepraam
af9f718936 Change edition to 2021 2022-04-05 10:39:31 +02:00
Terts Diepraam
b7809bd889 version 0.0.13 2022-04-02 11:04:27 +02:00
Ackerley Tng
e9131e2b7f wc: compute number widths using total file sizes
Previously, individual file sizes were used to compute the number width. This
caused misalignment when the total had more digits than any individual count,
and differed from the behavior of GNU wc:

```
$ ./target/debug/wc -w -l -m -c -L deny.toml GNUmakefile
  95  422 3110 3110   85 deny.toml
 349  865 6996 6996  196 GNUmakefile
 444 1287 10106 10106  196 total
$ wc -w -l -m -c -L deny.toml GNUmakefile
   95   422  3110  3110    85 deny.toml
  349   865  6996  6996   196 GNUmakefile
  444  1287 10106 10106   196 total
```
2022-03-28 18:56:34 +02:00
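The alignment rule from this commit can be sketched as follows: derive a single column width from the totals, then right-align every number to it. The helper names `digits`, `format_row`, and `format_table` are hypothetical, and the sketch covers only the four summed columns (lines, words, chars, bytes), leaving out `-L`, whose total is a maximum rather than a sum.

```rust
/// Number of decimal digits needed to print `n`.
fn digits(n: u64) -> usize {
    n.to_string().len()
}

/// Right-aligns each count to `width` and appends the row name.
fn format_row(counts: &[u64; 4], name: &str, width: usize) -> String {
    let cols: Vec<String> = counts
        .iter()
        .map(|c| format!("{:>width$}", c, width = width))
        .collect();
    format!("{} {}", cols.join(" "), name)
}

/// Formats one row per file plus a "total" row. The width comes from
/// the totals, so the total row can never be wider than the others.
fn format_table(files: &[(&str, [u64; 4])]) -> Vec<String> {
    let mut totals = [0u64; 4];
    for (_, counts) in files {
        for (t, c) in totals.iter_mut().zip(counts) {
            *t += c;
        }
    }
    let width = totals.iter().map(|&t| digits(t)).max().unwrap_or(1);
    let mut out: Vec<String> = files
        .iter()
        .map(|(name, counts)| format_row(counts, name, width))
        .collect();
    out.push(format_row(&totals, "total", width));
    out
}
```

With the counts for `deny.toml` and `GNUmakefile` from the example above, the largest total (10106) has five digits, so every column is padded to width 5, as in the GNU output.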
Terts Diepraam
20212be4c8 fix clippy errors related to clap upgrade from 3.0.10 to 3.1.6 2022-03-17 22:46:56 +01:00
dependabot[bot]
59440d35c0
build(deps): bump clap from 3.0.10 to 3.1.6
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.10 to 3.1.6.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.10...v3.1.6)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-17 13:06:29 +00:00
Sylvestre Ledru
54a10e955a Update of the cargo.lock url to point to the right branch 2022-03-06 22:13:17 +01:00
Davide Cavalca
19af43222b Include license text in all published crates 2022-03-05 21:21:46 +01:00
Terts Diepraam
53070141c1
all: add format_usage function (#3139)
This should correct the usage strings in both the `--help` output and the user documentation. Previously, the name of the util sometimes did not show up correctly.
2022-02-21 17:14:03 +01:00
Allan Silva
e6c94c1cd7 wc: Fix clippy error 2022-02-07 10:20:52 -03:00
Allan Silva
6a6875012e wc: implement files0-from option
When this option is present, the files argument is not processed. Instead, the file list is read from the provided file and split on the ASCII NUL (\0) character. When files0-from is '-', the file list is read from stdin.
2022-02-04 10:12:08 -03:00
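Splitting a NUL-separated file list can be sketched as below. The function name `read_files0` is hypothetical and this is not the code from the commit; for simplicity the sketch converts names lossily to `String`, although real file names need not be valid UTF-8.

```rust
use std::io::{self, BufRead};

/// Reads NUL-separated file names from `reader`.
/// A missing terminator after the last name is tolerated.
fn read_files0<R: BufRead>(mut reader: R) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    let mut buf = Vec::new();
    loop {
        buf.clear();
        // read_until includes the delimiter in `buf` when it is found.
        if reader.read_until(b'\0', &mut buf)? == 0 {
            break;
        }
        if buf.last() == Some(&0) {
            buf.pop();
        }
        names.push(String::from_utf8_lossy(&buf).into_owned());
    }
    Ok(names)
}
```

Passing `io::stdin().lock()` as the reader would correspond to the `--files0-from=-` case, while a `BufReader` around an opened file would cover a named list file.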
Daniel Eades
41e2197188 squash some repeated match blocks 2022-01-30 18:32:09 +01:00
Daniel Eades
ba45fe312a use 'Self' and derive 'Default' where possible 2022-01-30 15:08:26 +01:00
Daniel Eades
784f2e2ea1 use semicolons if nothing returned 2022-01-30 15:08:26 +01:00
Daniel Eades
a2d5f06be4 remove needless pass by value 2022-01-30 15:08:26 +01:00
Terts Diepraam
eb82015b23 all: change macros
- Change the main! proc_macro to a bin! macro_rules macro.
- Reexport uucore_procs from uucore
- Make utils not import uucore_procs directly
- Remove the `syn` dependency and don't parse proc_macro input (hopefully for faster compile times)
2022-01-29 15:26:32 +01:00
Terts Diepraam
9c8e865b55 all: enable infer long arguments in clap 2022-01-29 02:06:29 +01:00
Sylvestre Ledru
fed5ca4ba9
Merge pull request #2935 from tertsdiepraam/wc-unusual-files
wc: fix counting files from pseudo-filesystem
2022-01-29 01:09:02 +01:00
Sylvestre Ledru
7f79fef2cd fix various doc warnings 2022-01-29 00:09:09 +01:00
Terts Diepraam
dd311b294b wc: fix counting files from pseudo-filesystem 2022-01-28 19:08:44 +01:00
Terts Diepraam
55a47f6fc0
Merge pull request #2863 from tertsdiepraam/clap-3
Clap 3
2022-01-20 23:14:52 +01:00
Roy Ivy III
2e251f91f1 0.0.12 2022-01-19 05:35:00 -06:00
Terts Diepraam
8872485922 Merge branch 'main' into clap-3 2022-01-17 13:25:51 +01:00
Sylvestre Ledru
1fbda8003c coreutils 0.0.8 => 0.0.9, uucore_procs 0.0.7 => 0.0.8, uucore 0.0.10 => 0.0.11 2022-01-16 17:05:48 +01:00
Terts Diepraam
e9e5768591 wc: clap 3 2022-01-11 19:16:48 +01:00
Roy Ivy III
774e72551b change ~ relax 'nix' version and remove 'nix' patch
- code coverage compilation on MacOS latest (MacOS-11+) now works with newer 'nix' versions
2022-01-09 18:57:25 -06:00
Jeffrey Finkelstein
9caf15c44f fixup! wc: return UResult from uumain() function 2022-01-02 19:40:22 -05:00
Jeffrey Finkelstein
e060ac53f2 wc: return UResult from uumain() function 2022-01-02 11:15:30 -05:00
Roy Ivy III
03e0cbb020 update 'nix' within workspace to force patched version 2021-11-19 17:55:03 -06:00
Roy Ivy III
f20aa49821 maint/CICD ~ (GHA) fix cargo-udeps false positives (add 'ignore' exceptions to sub-crates) 2021-11-19 17:55:02 -06:00