Commit graph

3 commits

Author SHA1 Message Date
Wind
a15462fd00
Change default algorithm in detect columns (#12277)
# Description
@fdncred found another histogram based algorithm to detect columns, and
rewrite it in rust: https://github.com/fdncred/guess-width

I have tested it manually, and it works good with `df`, `docker ps`,
`^ps`. This pr is going to use the algorithm in `detect columns`

Fix: #4183

The pitfall of new algorithm:
1. it may not works well if there isn't too much rows of input
2. it may not works well if the length of value is less than the header
to value, e.g:
```
c1 c2 c3 c4 c5
a b c d e
g h i j k
g a a q d
a v c q q | detect columns
```
In this case, users might need to use ~~`--old`~~ `--legacy` to make it
works well.

# User-Facing Changes
User might need to add ~~`--old`~~ `--legacy` to scripts if they find
`detect columns` in their scripts broken.

# Tests + Formatting
Done

# After Submitting
NaN
2024-03-26 13:57:55 +08:00
Hofer-Julian
d0dc6986dd
Use long options for string (#10777) 2023-10-19 22:08:09 +02:00
mengsuenyan
cdc4fb1011
fix #9653 the cmd detect columns with the flag -c (#9667)
fix `detect columns` with flag `-c, --combine-columns` run failed when
using some range

- fixes #9653 

fix #9653 the cmd detect columns with the flag -c, --combine-columns run
failed when using some range.

add unit test for the command `detect columns`

```text
Attempt to automatically split text into multiple columns.

Usage:
  > detect columns {flags} 

Flags:
  -h, --help - Display the help message for this command
  -s, --skip <Int> - number of rows to skip before detecting
  -n, --no-headers - don't detect headers
  -c, --combine-columns <Range> - columns to be combined; listed as a range

Signatures:
  <string> | detect columns -> <table>

Examples:
  Splits string across multiple columns
  > 'a b c' | detect columns -n
  ╭───┬─────────┬─────────┬─────────╮
  │ # │ column0 │ column1 │ column2 │
  ├───┼─────────┼─────────┼─────────┤
  │ 0 │ a       │ b       │ c       │
  ╰───┴─────────┴─────────┴─────────╯

  Splits a multi-line string into columns with headers detected
  > $'c1 c2 c3 c4 c5(char nl)a b c d e' | detect columns
  ╭───┬────┬────┬────┬────┬────╮
  │ # │ c1 │ c2 │ c3 │ c4 │ c5 │
  ├───┼────┼────┼────┼────┼────┤
  │ 0 │ a  │ b  │ c  │ d  │ e  │
  ╰───┴────┴────┴────┴────┴────╯

  
  > $'c1 c2 c3 c4 c5(char nl)a b c d e' | detect columns -c 0..1
  ╭───┬─────┬────┬────┬────╮
  │ # │ c1  │ c3 │ c4 │ c5 │
  ├───┼─────┼────┼────┼────┤
  │ 0 │ a b │ c  │ d  │ e  │
  ╰───┴─────┴────┴────┴────╯

  Splits a multi-line string into columns with headers detected
  > $'c1 c2 c3 c4 c5(char nl)a b c d e' | detect columns -c -2..-1
  ╭───┬────┬────┬────┬─────╮
  │ # │ c1 │ c2 │ c3 │ c4  │
  ├───┼────┼────┼────┼─────┤
  │ 0 │ a  │ b  │ c  │ d e │
  ╰───┴────┴────┴────┴─────╯

  Splits a multi-line string into columns with headers detected
  > $'c1 c2 c3 c4 c5(char nl)a b c d e' | detect columns -c 2..
  ╭───┬────┬────┬───────╮
  │ # │ c1 │ c2 │  c3   │
  ├───┼────┼────┼───────┤
  │ 0 │ a  │ b  │ c d e │
  ╰───┴────┴────┴───────╯

  Parse external ls command and combine columns for datetime
  > ^ls -lh | detect columns --no-headers --skip 1 --combine-columns 5..7

```
2023-07-21 08:25:06 -05:00