choose/readme.md

170 lines
5.3 KiB
Markdown
Raw Normal View History

# Choose
2020-06-08 03:12:19 +00:00
This is `choose`, a human-friendly and fast alternative to `cut` and (sometimes) `awk`
2020-04-01 20:32:19 +00:00
[![`choose` demo](https://asciinema.org/a/315932.png)](https://asciinema.org/a/315932?autoplay=1)
## Features
- terse field selection syntax similar to Python's list slices
- negative indexing from end of line
- optional start/end index
- zero-indexed
- reverse ranges
- slightly faster than `cut` for sufficiently long inputs, much faster than
`awk`
- regular expression field separators using Rust's regex syntax
## Rationale
The AWK programming language is designed for text processing and is extremely
capable in this endeavor. However, the `awk` command is not ideal for rapid
shell use, with its requisite quoting of a line wrapped in curly braces, even
for the simplest of programs:
```bash
awk '{print $1}'
```
2020-04-01 20:32:19 +00:00
Likewise, `cut` is far from ideal for rapid shell use, because of its confusing
syntax. Field separators and ranges are just plain difficult to get right on the
first try.
It is for these reasons that I present to you `choose`. It is not meant to be a
drop-in or complete replacement for either of the aforementioned tools, but
rather a simple and intuitive tool to reach for when the basics of `awk` or
`cut` will do, but the overhead of getting them to behave should not be
necessary.
2020-06-08 15:04:12 +00:00
## Contributing
Please see our guidelines in [contributing.md](contributing.md).
## Usage
2019-09-12 02:57:44 +00:00
```
2020-04-01 20:32:19 +00:00
$ choose --help
2020-06-09 03:23:33 +00:00
choose 1.2.0
2019-09-12 02:57:44 +00:00
`choose` sections from each line of files
2019-09-12 02:57:44 +00:00
USAGE:
choose [FLAGS] [OPTIONS] <choices>...
2019-09-12 02:57:44 +00:00
FLAGS:
2020-06-02 19:09:00 +00:00
-c, --character-wise Choose fields by character number
-d, --debug Activate debug mode
-x, --exclusive Use exclusive ranges, similar to array indexing in many programming languages
-h, --help Prints help information
-n, --non-greedy Use non-greedy field separators
-V, --version Prints version information
2019-09-12 02:57:44 +00:00
OPTIONS:
2020-06-02 19:09:00 +00:00
-f, --field-separator <field-separator>
Specify field separator other than whitespace, using Rust `regex` syntax
-i, --input <input> Input file
-o, --output-field-separator <output-field-separator> Specify output field separator
2019-09-12 02:57:44 +00:00
ARGS:
<choices>... Fields to print. Either a, a:b, a..b, or a..=b, where a and b are integers. The beginning or end
of a range can be omitted, resulting in including the beginning or end of the line,
respectively. a:b is inclusive of b (unless overridden by -x). a..b is exclusive of b and a..=b
is inclusive of b
```
### Examples
```bash
choose 5 # print the 5th item from a line (zero indexed)
choose -f ':' 0 3 5 # print the 0th, 3rd, and 5th item from a line, where
# items are separated by ':' instead of whitespace
choose 2:5 # print everything from the 2nd to 5th item on the line,
2020-04-02 19:25:47 +00:00
# inclusive of the 5th
2020-04-01 20:32:19 +00:00
choose -x 2:5 # print everything from the 2nd to 5th item on the line,
# exclusive of the 5th
choose :3 # print the beginning of the line to the 3rd item
choose -x :3 # print the beginning of the line to the 3rd item,
# exclusive
choose 3: # print the third item to the end of the line
2020-04-01 20:32:19 +00:00
choose -1 # print the last item from a line
choose -3:-1 # print the last three items from a line
```
2019-09-17 21:17:43 +00:00
## Compilation and Installation
2020-04-04 18:00:59 +00:00
### Installing From Source
2020-04-01 20:32:19 +00:00
In order to build `choose` you will need the rust toolchain installed. You can
find instructions [here](https://www.rust-lang.org/tools/install).
2019-09-17 21:17:43 +00:00
Then, to install:
```bash
2020-04-13 17:45:30 +00:00
$ git clone https://github.com/theryangeary/choose.git
$ cd choose
$ cargo build --release
$ install target/release/choose <DESTDIR>
2019-09-17 21:17:43 +00:00
```
Just make sure DESTDIR is in your path.
2020-04-01 20:32:19 +00:00
2020-04-04 18:00:59 +00:00
### Installing From Package Managers
2020-06-08 03:12:19 +00:00
Cargo:
```
$ cargo install choose
```
2020-04-13 17:45:30 +00:00
Arch Linux:
```
$ yay -S choose-rust-git
```
2020-04-04 18:00:59 +00:00
Fedora/CentOS [COPR](https://copr.fedorainfracloud.org/coprs/atim/choose/):
2020-04-04 12:38:11 +00:00
2020-04-04 18:00:59 +00:00
```
2020-04-13 17:45:30 +00:00
$ dnf copr enable atim/choose
$ dnf install choose
2020-04-04 18:00:59 +00:00
```
2020-04-04 12:38:11 +00:00
2020-04-01 20:32:19 +00:00
### Benchmarking
Benchmarking is performed using the [`bench` utility](https://github.com/Gabriel439/bench).
Benchmarking is based on the assumption that there are five files in `test/`
that match the glob "long*txt". GitHub doesn't support files big enough in
normal repos, but for reference the files I'm working with have lengths like
these:
```
1000 test/long.txt
19272 test/long_long.txt
96360 test/long_long_long.txt
963600 test/long_long_long_long.txt
10599600 test/long_long_long_long_long.txt
```
and content generally like this:
```
Those an equal point no years do. Depend warmth fat but her but played. Shy and
subjects wondered trifling pleasant. Prudent cordial comfort do no on colonel as
assured chicken. Smart mrs day which begin. Snug do sold mr it if such.
Terminated uncommonly at at estimating. Man behaviour met moonlight extremity
acuteness direction.
Ignorant branched humanity led now marianne too strongly entrance. Rose to shew
bore no ye of paid rent form. Old design are dinner better nearer silent excuse.
She which are maids boy sense her shade. Considered reasonable we affronting on
expression in. So cordial anxious mr delight. Shot his has must wish from sell
```
2020-04-04 12:38:11 +00:00