# Description
Closes#12535
Implements sort-by functionality of #8322
Fixes sort-by part of #8667
This PR does two main things: add a new cell path and closure parameter
to `sort-by`, and attempt to make Nushell's sorting behavior
well-defined.
## `sort-by` features
The `columns` parameter is replaced with a `comparator` parameter, which
can be a cell path or a closure. Examples are from docs PR.
1. Cell paths
The basic interactive usage of `sort-by` is the same. For example, `ls |
sort-by modified` still works the same as before. It is not quite a
drop-in replacement, see [behavior changes](#behavior-changes).
Here's an example of how the cell path comparator might be useful:
```nu
> let cities = [
{name: 'New York', info: { established: 1624, population: 18_819_000 } }
{name: 'Kyoto', info: { established: 794, population: 37_468_000 } }
{name: 'São Paulo', info: { established: 1554, population: 21_650_000 }
}
]
> $cities | sort-by info.established
╭───┬───────────┬────────────────────────────╮
│ # │ name │ info │
├───┼───────────┼────────────────────────────┤
│ 0 │ Kyoto │ ╭─────────────┬──────────╮ │
│ │ │ │ established │ 794 │ │
│ │ │ │ population │ 37468000 │ │
│ │ │ ╰─────────────┴──────────╯ │
│ 1 │ São Paulo │ ╭─────────────┬──────────╮ │
│ │ │ │ established │ 1554 │ │
│ │ │ │ population │ 21650000 │ │
│ │ │ ╰─────────────┴──────────╯ │
│ 2 │ New York │ ╭─────────────┬──────────╮ │
│ │ │ │ established │ 1624 │ │
│ │ │ │ population │ 18819000 │ │
│ │ │ ╰─────────────┴──────────╯ │
╰───┴───────────┴────────────────────────────╯
```
2. Key closures
You can supply a closure which will transform each value into a sorting
key (without changing the underlying data). Here's an example of a key
closure, where we want to sort a list of assignments by their average
grade:
```nu
> let assignments = [
{name: 'Homework 1', grades: [97 89 86 92 89] }
{name: 'Homework 2', grades: [91 100 60 82 91] }
{name: 'Exam 1', grades: [78 88 78 53 90] }
{name: 'Project', grades: [92 81 82 84 83] }
]
> $assignments | sort-by { get grades | math avg }
╭───┬────────────┬───────────────────────╮
│ # │ name │ grades │
├───┼────────────┼───────────────────────┤
│ 0 │ Exam 1 │ [78, 88, 78, 53, 90] │
│ 1 │ Project │ [92, 81, 82, 84, 83] │
│ 2 │ Homework 2 │ [91, 100, 60, 82, 91] │
│ 3 │ Homework 1 │ [97, 89, 86, 92, 89] │
╰───┴────────────┴───────────────────────╯
```
3. Custom sort closure
The `--custom`, or `-c`, flag will tell `sort-by` to interpret closures
as custom sort closures. A custom sort closure has two parameters, and
returns a boolean. The closure should return `true` if the first
parameter comes _before_ the second parameter in the sort order.
For a simple example, we could rewrite a cell path sort as a custom sort
(see
[here](https://github.com/nushell/nushell.github.io/pull/1568/files#diff-a7a233e66a361d8665caf3887eb71d4288000001f401670c72b95cc23a948e86R231)
for a more complex example):
```nu
> ls | sort-by -c {|a, b| $a.size < $b.size }
╭───┬─────────────────────┬──────┬──────────┬────────────────╮
│ # │ name │ type │ size │ modified │
├───┼─────────────────────┼──────┼──────────┼────────────────┤
│ 0 │ my-secret-plans.txt │ file │ 100 B │ 10 minutes ago │
│ 1 │ shopping_list.txt │ file │ 100 B │ 2 months ago │
│ 2 │ myscript.nu │ file │ 1.1 KiB │ 2 weeks ago │
│ 3 │ bigfile.img │ file │ 10.0 MiB │ 3 weeks ago │
╰───┴─────────────────────┴──────┴──────────┴────────────────╯
```
## Making sort more consistent
I think it's important for something as essential as `sort` to have
well-defined semantics. This PR contains some changes to try to make the
behavior of `sort` and `sort-by` consistent. In addition, after working
with the internals of sorting code, I have a much deeper understanding
of all of the edge cases. Here is my attempt to try to better define
some of the semantics of sorting (if you are just interested in changes,
skip to "User-Facing changes")
- `sort`, `sort -v`, and `sort-by` now all work the same. Each
individual sort implementation has been refactored into two functions in
`sort_utils.rs`: `sort`, and `sort_by`. These can also be used in other
parts of Nushell where values need to be sorted.
- `sort` and `sort-by` used to handle `-i` and `-n` differently.
- `sort -n` would consider all values which can't be coerced into a
string to be equal
- `sort-by -i` and `sort-by -n` would only work if all values were
strings
- In this PR, insensitive sort only affects comparison between strings,
and natural sort only applies to numbers and strings (see below).
- (not a change) Before and after this PR, `sort` and `sort-by` support
sorting mixed types. There was a lot of discussion about potentially
making `sort` and `sort-by` only work on lists of homogeneous types, but
the general consensus was that `sort` should not error just because its
input contains incompatible types.
- In order to try to make working with data containing `null` values
easier, I changed the PartialOrd order to sort `Nothing` values to the
end of a list, regardless of what other types the list contains. Before,
`null` would be sorted before `Binary`, `CellPath`, and `Custom` values.
- (not a change) When sorted, lists of mixed types will contain sorted
values of each type in order, for the most part
- (not a change) For example, `[0x[1] (date now) "a" ("yesterday" | into
datetime) "b" 0x[0]]` will be sorted as `["a", "b", a day ago, now, [0],
[1]]`, where sorted strings appear first, then sorted datetimes, etc.
- (not a change) The exception to this is `Int`s and `Float`s, which
will intermix, `Strings` and `Glob`s, which will intermix, and `None` as
described above. Additionally, natural sort will intermix strings with
ints and floats (see below).
- Natural sort no longer coerce all inputs to strings.
- I did originally make natural only apply to strings, but @fdncred
pointed out that the previous behavior also allowed you to sort numeric
strings with numbers. This seems like a useful feature if we are trying
to support sorting with mixed types, so I settled on coercing only
numbers (int, float). This can be reverted if people don't like it.
- Here is an example of this behavior in action, which is the same
before and after this PR:
```nushell
$ [1 "4" 3 "2"] | sort --natural
╭───┬───╮
│ 0 │ 1 │
│ 1 │ 2 │
│ 2 │ 3 │
│ 3 │ 4 │
╰───┴───╯
```
# User-Facing Changes
## New features
- Replaces the `columns` string parameter of `sort-by` with a cell path
or a closure.
- The cell path parameter works exactly as you would expect
- By default, the `closure` parameter acts as a "key sort"; that is,
each element is transformed by the closure into a sorting key
- With the `--custom` (`-c`) parameter, you can define a comparison
function for completely custom sorting order.
## Behavior changes
<details>
<summary><code>sort -v</code> does not coerce record values to
strings</summary>
This was a bit of a surprising behavior, and is now unified with the
behavior of `sort` and `sort-by`. Here's an example where you can
observe the values being implicitly coerced into strings for sorting, as
they are sorted like strings rather than numbers:
Old behavior:
```nushell
$ {foo: 9 bar: 10} | sort -v
╭─────┬────╮
│ bar │ 10 │
│ foo │ 9 │
╰─────┴────╯
```
New behavior:
```nushell
$ {foo: 9 bar: 10} | sort -v
╭─────┬────╮
│ foo │ 9 │
│ bar │ 10 │
╰─────┴────╯
```
</details>
<details>
<summary>Changed <code>sort-by</code> parameters from
<code>string</code> to <code>cell-path</code> or <code>closure</code>.
Typical interactive usage is the same as before, but if passing a
variable to <code>sort-by</code> it must be a cell path (or closure),
not a string</summary>
Old behavior:
```nushell
$ let sort = "modified"
$ ls | sort-by $sort
╭───┬──────┬──────┬──────┬────────────────╮
│ # │ name │ type │ size │ modified │
├───┼──────┼──────┼──────┼────────────────┤
│ 0 │ foo │ file │ 0 B │ 10 hours ago │
│ 1 │ bar │ file │ 0 B │ 35 seconds ago │
╰───┴──────┴──────┴──────┴────────────────╯
```
New behavior:
```nushell
$ let sort = "modified"
$ ls | sort-by $sort
Error: nu:🐚:type_mismatch
× Type mismatch.
╭─[entry #10:1:14]
1 │ ls | sort-by $sort
· ──┬──
· ╰── Cannot sort using a value which is not a cell path or closure
╰────
$ let sort = $."modified"
$ ls | sort-by $sort
╭───┬──────┬──────┬──────┬───────────────╮
│ # │ name │ type │ size │ modified │
├───┼──────┼──────┼──────┼───────────────┤
│ 0 │ foo │ file │ 0 B │ 10 hours ago │
│ 1 │ bar │ file │ 0 B │ 2 minutes ago │
╰───┴──────┴──────┴──────┴───────────────╯
```
</details>
<details>
<summary>Insensitve and natural sorting behavior reworked</summary>
Previously, the `-i` and `-n` worked differently for `sort` and
`sort-by` (see "Making sort more consistent"). Here are examples of how
these options result in different sorts now:
1. `sort -n`
- Old behavior (types other than numbers, strings, dates, and binary
sorted incorrectly)
```nushell
$ [2sec 1sec] | sort -n
╭───┬──────╮
│ 0 │ 2sec │
│ 1 │ 1sec │
╰───┴──────╯
```
- New behavior
```nushell
$ [2sec 1sec] | sort -n
╭───┬──────╮
│ 0 │ 1sec │
│ 1 │ 2sec │
╰───┴──────╯
```
2. `sort-by -i`
- Old behavior (uppercase words appear before lowercase words as they
would in a typical sort, indicating this is not actually an insensitive
sort)
```nushell
$ ["BAR" "bar" "foo" 2 "FOO" 1] | wrap a | sort-by -i a
╭───┬─────╮
│ # │ a │
├───┼─────┤
│ 0 │ 1 │
│ 1 │ 2 │
│ 2 │ BAR │
│ 3 │ FOO │
│ 4 │ bar │
│ 5 │ foo │
╰───┴─────╯
```
- New behavior (strings are sorted stably, indicating this is an
insensitive sort)
```nushell
$ ["BAR" "bar" "foo" 2 "FOO" 1] | wrap a | sort-by -i a
╭───┬─────╮
│ # │ a │
├───┼─────┤
│ 0 │ 1 │
│ 1 │ 2 │
│ 2 │ BAR │
│ 3 │ bar │
│ 4 │ foo │
│ 5 │ FOO │
╰───┴─────╯
```
3. `sort-by -n`
- Old behavior (natural sort does not work when data contains non-string
values)
```nushell
$ ["10" 8 "9"] | wrap a | sort-by -n a
╭───┬────╮
│ # │ a │
├───┼────┤
│ 0 │ 8 │
│ 1 │ 10 │
│ 2 │ 9 │
╰───┴────╯
```
- New behavior
```nushell
$ ["10" 8 "9"] | wrap a | sort-by -n a
╭───┬────╮
│ # │ a │
├───┼────┤
│ 0 │ 8 │
│ 1 │ 9 │
│ 2 │ 10 │
╰───┴────╯
```
</details>
<details>
<summary>
Sorting a list of non-record values with a non-existent column/path now
errors instead of sorting the values directly (<code>sort</code> should
be used for this, not <code>sort-by</code>)
</summary>
Old behavior:
```nushell
$ [2 1] | sort-by foo
╭───┬───╮
│ 0 │ 1 │
│ 1 │ 2 │
╰───┴───╯
```
New behavior:
```nushell
$ [2 1] | sort-by foo
Error: nu:🐚:incompatible_path_access
× Data cannot be accessed with a cell path
╭─[entry #29:1:17]
1 │ [2 1] | sort-by foo
· ─┬─
· ╰── int doesn't support cell paths
╰────
```
</details>
<details>
<summary><code>sort</code> and <code>sort-by</code> output
<code>List</code> instead of <code>ListStream</code> </summary>
This isn't a meaningful change (unless I misunderstand the purpose of
ListStream), since `sort` and `sort-by` both need to collect in order to
do the sorting anyway, but is user observable.
Old behavior:
```nushell
$ ls | sort | describe -d
╭──────────┬───────────────────╮
│ type │ stream │
│ origin │ nushell │
│ subtype │ {record 3 fields} │
│ metadata │ {record 1 field} │
╰──────────┴───────────────────╯
```
```nushell
$ ls | sort-by name | describe -d
╭──────────┬───────────────────╮
│ type │ stream │
│ origin │ nushell │
│ subtype │ {record 3 fields} │
│ metadata │ {record 1 field} │
╰──────────┴───────────────────╯
```
New behavior:
```nushell
ls | sort | describe -d
╭────────┬─────────────────╮
│ type │ list │
│ length │ 22 │
│ values │ [table 22 rows] │
╰────────┴─────────────────╯
```
```nushell
$ ls | sort-by name | describe -d
╭────────┬─────────────────╮
│ type │ list │
│ length │ 22 │
│ values │ [table 22 rows] │
╰────────┴─────────────────╯
```
</details>
- `sort` now errors when nothing is piped in (`sort-by` already did
this)
# Tests + Formatting
I added lots of unit tests on the new sort implementation to enforce new
sort behaviors and prevent regressions.
# After Submitting
See [docs PR](https://github.com/nushell/nushell.github.io/pull/1568),
which is ~2/3 finished.
---------
Co-authored-by: NotTheDr01ds <32344964+NotTheDr01ds@users.noreply.github.com>
Co-authored-by: Ian Manske <ian.manske@pm.me>
# Description
The meaning of the word usage is specific to describing how a command
function is *used* and not a synonym for general description. Usage can
be used to describe the SYNOPSIS or EXAMPLES sections of a man page
where the permitted argument combinations are shown or example *uses*
are given.
Let's not confuse people and call it what it is a description.
Our `help` command already creates its own *Usage* section based on the
available arguments and doesn't refer to the description with usage.
# User-Facing Changes
`help commands` and `scope commands` will now use `description` or
`extra_description`
`usage`-> `description`
`extra_usage` -> `extra_description`
Breaking change in the plugin protocol:
In the signature record communicated with the engine.
`usage`-> `description`
`extra_usage` -> `extra_description`
The same rename also takes place for the methods on
`SimplePluginCommand` and `PluginCommand`
# Tests + Formatting
- Updated plugin protocol specific changes
# After Submitting
- [ ] update plugin protocol doc
# Description
Removes the old `nu-cmd-dataframe` crate in favor of the polars plugin.
As such, this PR also removes the `dataframe` feature, related CI, and
full releases of nushell.
# Description
This updates all the positional arguments (except with
`--features=dataframe` or `--features=extra`) to start with an uppercase
letter and end with a period.
Part of #5066, specifically [this
comment](/nushell/nushell/issues/5066#issuecomment-1421528910)
Some arguments had example data removed from them because it also
appears in the examples.
There are other inconsistencies in positional arguments I noticed while
making the tests pass which I will bring up in #5066.
# User-Facing Changes
Positional arguments are now consistent
# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
# After Submitting
Automatic documentation updates
# Description
This repeats #8268 to make all command usage strings start with an
uppercase letter and end with a period per #5056
Adds a test to ensure that commands won't regress
Part of #5066
# User-Facing Changes
Command usage is now consistent
# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
# After Submitting
Automatic documentation updates
# Description
In the past we named the process of completely removing a command and
providing a basic error message pointing to the new alternative
"deprecation".
But this doesn't match the expectation of most users that have seen
deprecation _warnings_ that alert to either impending removal or
discouraged use after a stability promise.
# User-Facing Changes
Command category changed from `deprecated` to `removed`
# Description
Make sure that our different crates that contain commands can be
compiled in parallel.
This can under certain circumstances accelerate the compilation with
sufficient multithreading available.
## Details
- Move `help` commands from `nu-cmd-lang` back to `nu-command`
- This also makes sense as the commands are implemented in an
ANSI-terminal specific way
- Make `nu-cmd-lang` only a dev dependency for `nu-command`
- Change context creation helpers for `nu-cmd-extra` and
`nu-cmd-dataframe` to have a consistent api used in
`src/main.rs`:`get_engine_state()`
- `nu-command` now indepedent from `nu-cmd-extra` and `nu-cmd-dataframe`
that are now dependencies of `nu` directly. (change to internal
features)
- Fix tests that previously used `nu-command::create_default_context()`
with replacement functions
## From scratch compilation times:
just debug (dev) build and default features
```
cargo clean --profile dev && cargo build --timings
```
### before
![grafik](https://github.com/nushell/nushell/assets/15833959/e49f1f42-2e53-4a6c-bc23-625b686af1bc)
### after
![grafik](https://github.com/nushell/nushell/assets/15833959/8dec4723-e625-4a86-b91e-e6e808f64726)
# User-Facing Changes
None direct, only change to compilation on multithreaded jobs expected.
# Tests + Formatting
Tests that previously chose to use `nu-command` for their scope will
still use `nu-cmd-lang` + `nu-command` (command list in the granularity
at the time)
# Description
This does a lookup in the cache of parsed files to see if a span can be
found for a file that was previously loaded with the same contents, then
uses that span to find the parsed block for that file. The end result
should, in theory, be identical but doesn't require any reparsing or
creating new blocks/new definitions that aren't needed.
This drops the sg.nu benchmark from:
```
╭───┬───────────────────╮
│ 0 │ 280ms 606µs 208ns │
│ 1 │ 282ms 654µs 416ns │
│ 2 │ 252ms 640µs 541ns │
│ 3 │ 250ms 940µs 41ns │
│ 4 │ 241ms 216µs 375ns │
│ 5 │ 257ms 310µs 583ns │
│ 6 │ 196ms 739µs 416ns │
╰───┴───────────────────╯
```
to:
```
╭───┬───────────────────╮
│ 0 │ 118ms 698µs 125ns │
│ 1 │ 121ms 327µs │
│ 2 │ 121ms 873µs 500ns │
│ 3 │ 124ms 94µs 708ns │
│ 4 │ 113ms 733µs 291ns │
│ 5 │ 108ms 663µs 125ns │
│ 6 │ 63ms 482µs 625ns │
╰───┴───────────────────╯
```
I was hoping to also see some startup time improvements, but I didn't
notice much there.
# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->
# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.
Make sure you've run and fixed any issues with these commands:
- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- crates/nu-std/tests/run.nu` to run the tests for the
standard library
> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->
# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
# Description
We were seeing duplicate entries for the std lib files, and this PR
addresses that. Each file should now only be added once.
Note: they are still parsed twice because it's hard to recover the
module from the output of `parse` but a bit of clever hacking in a
future PR might be able to do that.
# User-Facing Changes
_(List of all changes that impact the user experience here. This helps
us keep track of breaking changes.)_
# Tests + Formatting
Don't forget to add tests that cover your changes.
Make sure you've run and fixed any issues with these commands:
- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- crates/nu-std/tests/run.nu` to run the tests for the
standard library
> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
# After Submitting
If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
* Test commands for proper names and search terms
Assert that the `Command.name()` is equal to `Signature.name`
Check that search terms are not just substrings of the command name as
they would not help finding the command.
* Clean up search terms
Remove redundant terms that just replicate the command name.
Try to eliminate substring between search terms, clean up where
necessary.
* Remove comment
* Split delta and environment merging
* Move table mode to a more logical place
* Cleanup
* Merge environment after reading default_env.nu
* Fmt
* move commands, futures.rs, script.rs, utils
* move over maybe_print_errors
* add nu_command crate references to nu_cli
* in commands.rs open up to pub mod from pub(crate)
* nu-cli, nu-command, and nu tests are now passing
* cargo fmt
* clean up nu-cli/src/prelude.rs
* code cleanup
* for some reason lex.rs was not formatted, may be causing my error
* remove mod completion from lib.rs which was not being used along with quickcheck macros
* add in allow unused imports
* comment out one failing external test; comment out one failing internal test
* revert commenting out failing tests; something else might be going on; someone with a windows machine should check and see what is going on with these failing windows tests
* Update Cargo.toml
Extend the optional features to nu-command
Co-authored-by: Jonathan Turner <jonathandturner@users.noreply.github.com>
2021-01-12 17:59:53 +13:00
Renamed from crates/nu-cli/tests/main.rs (Browse further)