# Choose

This is `choose`, a human-friendly and fast alternative to `cut` and (sometimes) `awk`

[![`choose` demo](https://asciinema.org/a/315932.png)](https://asciinema.org/a/315932?autoplay=1)

## Features

- terse field selection syntax similar to Python's list slices
- negative indexing from end of line
- optional start/end index
- zero-indexed
- reverse ranges
- slightly faster than `cut` for sufficiently long inputs, much faster than
  `awk`
- regular expression field separators using Rust's regex syntax

## Rationale

The AWK programming language is designed for text processing and is extremely
capable in this endeavor. However, the `awk` command is not ideal for rapid
shell use, with its requisite quoting of a line wrapped in curly braces, even
for the simplest of programs:

```bash
awk '{print $1}'
```

Likewise, `cut` is far from ideal for rapid shell use, because of its confusing
syntax. Field separators and ranges are just plain difficult to get right on the
first try.

It is for these reasons that I present to you `choose`. It is not meant to be a
drop-in or complete replacement for either of the aforementioned tools, but
rather a simple and intuitive tool to reach for when the basics of `awk` or
`cut` will do, but the overhead of getting them to behave should not be
necessary.

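
To make the field-separator complaint concrete, here is a small sketch that uses only `awk` and `cut`. The sample line is made up for illustration. When fields are separated by runs of spaces, `cut -d ' '` counts every single space as a delimiter, while `awk` collapses the run.

```bash
# Sample input (made up for illustration): fields separated by runs of spaces.
line='alpha   beta gamma'

# awk splits on runs of whitespace, so the second field is "beta".
awk_field=$(printf '%s\n' "$line" | awk '{print $2}')

# cut treats each single space as a delimiter, so field 2 is the empty
# string that sits between the first and second space.
cut_field=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

printf 'awk: [%s]\ncut: [%s]\n' "$awk_field" "$cut_field"
```

Going by the zero-indexing and default whitespace separation described in this README, the `choose` counterpart of the `awk` call would presumably be `choose 1`.
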
## Contributing

Please see our guidelines in [contributing.md](contributing.md).

## Usage

```
$ choose --help
choose 1.1.2
`choose` sections from each line of files

USAGE:
    choose [FLAGS] [OPTIONS] <choices>...

FLAGS:
    -c, --character-wise    Choose fields by character number
    -d, --debug             Activate debug mode
    -x, --exclusive         Use exclusive ranges, similar to array indexing in many programming languages
    -h, --help              Prints help information
    -n, --non-greedy        Use non-greedy field separators
    -V, --version           Prints version information

OPTIONS:
    -f, --field-separator <field-separator>
            Specify field separator other than whitespace, using Rust `regex` syntax

    -i, --input <input>                                      Input file
    -o, --output-field-separator <output-field-separator>    Specify output field separator

ARGS:
    <choices>...    Fields to print. Either a, a:b, a..b, or a..=b, where a and b are integers. The beginning or end
                    of a range can be omitted, resulting in including the beginning or end of the line,
                    respectively. a:b is inclusive of b (unless overridden by -x). a..b is exclusive of b and a..=b
                    is inclusive of b
```
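
The range spellings in the help text differ only in whether the end index is included. The sketch below emulates that rule with a hypothetical `awk`-based helper, `fields_range`. It is not part of `choose` and is not how `choose` is implemented, and it ignores negative indices, open-ended ranges, and reverse ranges.

```bash
# Hypothetical helper (not part of choose): print the zero-indexed fields
# START through END of each input line. MODE is "inclusive" (like a:b or
# a..=b) or "exclusive" (like a..b, or a:b with -x).
fields_range() {
  awk -v start="$1" -v end="$2" -v mode="$3" '{
    last = (mode == "exclusive") ? end - 1 : end
    out = ""
    # awk fields are one-indexed, so shift every index by one.
    for (i = start + 1; i <= last + 1 && i <= NF; i++) {
      out = (out == "") ? $i : out " " $i
    }
    print out
  }'
}

printf 'a b c d e f g\n' | fields_range 2 5 inclusive   # prints: c d e f
printf 'a b c d e f g\n' | fields_range 2 5 exclusive   # prints: c d e
```
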

### Examples

```bash
choose 5                # print the 5th item from a line (zero indexed)

choose -f ':' 0 3 5     # print the 0th, 3rd, and 5th item from a line, where
                        # items are separated by ':' instead of whitespace

choose 2:5              # print everything from the 2nd to 5th item on the line,
                        # inclusive of the 5th

choose -x 2:5           # print everything from the 2nd to 5th item on the line,
                        # exclusive of the 5th

choose :3               # print the beginning of the line to the 3rd item

choose -x :3            # print the beginning of the line to the 3rd item,
                        # exclusive

choose 3:               # print the third item to the end of the line

choose -1               # print the last item from a line

choose -3:-1            # print the last three items from a line
```
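
For readers coming from `awk`, the following are rough `awk` counterparts of a few of the invocations above, assuming that zero-indexed `choose` fields map onto one-indexed `awk` fields. The sample line is made up, and the snippets are comparisons only: output spacing and edge-case handling in `choose` may differ.

```bash
# Sample input (made up for illustration).
sample='a:b:c:d:e:f:g'

# Roughly `choose -f ':' 0 3 5`: awk's $1, $4, and $6.
picked=$(printf '%s\n' "$sample" | awk -F ':' '{print $1, $4, $6}')

# Roughly `choose -f ':' -1`: the last field is $NF in awk.
last=$(printf '%s\n' "$sample" | awk -F ':' '{print $NF}')

# Roughly `choose -f ':' 5`: index 5 corresponds to awk's $6.
sixth=$(printf '%s\n' "$sample" | awk -F ':' '{print $6}')

printf '%s\n%s\n%s\n' "$picked" "$last" "$sixth"
```
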

## Compilation and Installation

### Installing From Source

In order to build `choose` you will need the Rust toolchain installed. You can
find instructions [here](https://www.rust-lang.org/tools/install).

Then, to install:

```bash
$ git clone https://github.com/theryangeary/choose.git
$ cd choose
$ cargo build --release
$ install target/release/choose <DESTDIR>
```

Just make sure DESTDIR is in your `PATH`.

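
One way to confirm that the install directory is on your `PATH` is sketched below. `DESTDIR` stands in for whatever directory you passed to `install`; `$HOME/.local/bin` is only an example, and `in_path` is a hypothetical helper name.

```bash
# Example location only; substitute the directory you actually installed to.
DESTDIR="$HOME/.local/bin"

# Hypothetical helper: succeed if the given directory is an entry in PATH.
in_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prepend DESTDIR for the current shell session if it is missing.
if ! in_path "$DESTDIR"; then
  export PATH="$DESTDIR:$PATH"
fi
```

To make the change permanent, the `export` line would normally go in your shell's startup file.
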

### Installing From Package Managers

Cargo:

```
$ cargo install choose
```

Arch Linux:

```
$ yay -S choose-rust-git
```

Fedora/CentOS [COPR](https://copr.fedorainfracloud.org/coprs/atim/choose/):

```
$ dnf copr enable atim/choose
$ dnf install choose
```

### Benchmarking

Benchmarking is performed using the [`bench` utility](https://github.com/Gabriel439/bench).

Benchmarking is based on the assumption that there are five files in `test/`
that match the glob "long*txt". GitHub doesn't support files big enough in
normal repos, but for reference the files I'm working with have lengths like
these:

```
    1000 test/long.txt
   19272 test/long_long.txt
   96360 test/long_long_long.txt
  963600 test/long_long_long_long.txt
10599600 test/long_long_long_long_long.txt
```

and content generally like this:

```
Those an equal point no years do. Depend warmth fat but her but played. Shy and
subjects wondered trifling pleasant. Prudent cordial comfort do no on colonel as
assured chicken. Smart mrs day which begin. Snug do sold mr it if such.
Terminated uncommonly at at estimating. Man behaviour met moonlight extremity
acuteness direction.

Ignorant branched humanity led now marianne too strongly entrance. Rose to shew
bore no ye of paid rent form. Old design are dinner better nearer silent excuse.
She which are maids boy sense her shade. Considered reasonable we affronting on
expression in. So cordial anxious mr delight. Shot his has must wish from sell
```