nushell

9709 commits 2 branches 108 tags 112 MiB

Author	SHA1	Message	Date
Jess	9daa5f9177	Fix silent failure of parsing input output types (#14510 ) - This PR should fix/close: - #11266 - #12893 - #13736 - #13748 - #14170 - It doesn't fix #13736 though unfortunately. The issue there is at a different level to this fix (I think probably in the lexing somewhere, which I haven't touched). # The Problem The linked issues have many examples of the problem and the related confusion it causes, but I'll give some more examples here for illustration. It boils down to the following: This doesn't type check (good): ```nu def foo []: string -> int { false } ``` This does (bad): ```nu def foo [] : string -> int { false } ``` Because the parser is completely ignoring all the characters. This also compiles in 0.100.0: ```nu def blue [] Da ba Dee da Ba da { false } ``` And this also means commands which have a completely fine type, but an extra space before `:`, lose that type information and end up as `any -> any`, e.g. ```nu def foo [] : int -> int {$in + 3} ``` ```bash $ foo --help Input/output types: ╭───┬───────┬────────╮ │ # │ input │ output │ ├───┼───────┼────────┤ │ 0 │ any │ any │ ╰───┴───────┴────────╯ ``` # The Fix Special thank you to @texastoland whose draft PR (#12358) I referenced heavily while making this fix. That PR seeks to fix the invalid parsing by disallowing whitespace between `[]` and `:` in declarations, e.g. `def foo [] : int -> any {}` This PR instead allows the whitespace while properly parsing the type signature. I think this is the better choice for a few reasons: - The parsing is still straightforward and the information is all there anyway, - It's more consistent with type annotations in other places, e.g. 
`do {\|nums : list<int>\| $nums \| describe} [ 1 2 3 ]` from the [Type Signatures doc page](https://www.nushell.sh/lang-guide/chapters/types/type_signatures.html) - It's more consistent with the new nu parser, which allows `let x : bool = false` (current nu doesn't, but this PR doesn't change that) - It will be less disruptive and should only break code where the types are actually wrong (if your types were correct, but you had a space before the `:`, those declarations will still compile and now have more type information vs. throwing an error in all cases and requiring spaces to be deleted) - It's the more intuitive syntax for most functional programmers like myself (haskell/lean/coq/agda and many more either allow or require whitespace for type annotations) I don't use Rust a lot, so I tried to keep most things the same and the rest I wrote as if it was Haskell (if you squint a bit). Code review/suggestions very welcome. I added all the tests I could think of and `toolkit check pr` gives it the all-clear. # User-Facing Changes This PR meets part of the goal of #13849, but doesn't do anything about parsing signatures twice and doesn't do much to improve error messages, it just enforces the existing errors and error messages. This will no doubt be a breaking change, mostly because the code is already broken and users don't realise yet (one of my personal scripts stopped compiling after this fix because I thought `def foo [] -> string {}` was valid syntax). It shouldn't break any type-correct code though.	2024-12-07 09:55:15 -06:00
Bahex	b6e84879b6	add multiple grouper support to `group-by` (#14337 ) - closes #14330 Related: - #2607 - #14019 - #14316 # Description This PR changes `group-by` to support grouping by multiple `grouper` arguments. # Changes - No grouper: no change in behavior - Single grouper - `--to-table=false`: no change in behavior - `--to-table=true`: - closure grouper: named group0 - cell-path grouper: named after the cell-path - Multiple groupers: - `--to-table=false`: nested groups - `--to-table=true`: one column for each grouper argument, followed by the `items` column - columns corresponding to cell-paths are named after them - columns corresponding to closure groupers are named `group{i}` where `i` is the index of the grouper argument # Examples ```nushell > [1 3 1 3 2 1 1] \| group-by ╭───┬───────────╮ │ │ ╭───┬───╮ │ │ 1 │ │ 0 │ 1 │ │ │ │ │ 1 │ 1 │ │ │ │ │ 2 │ 1 │ │ │ │ │ 3 │ 1 │ │ │ │ ╰───┴───╯ │ │ │ ╭───┬───╮ │ │ 3 │ │ 0 │ 3 │ │ │ │ │ 1 │ 3 │ │ │ │ ╰───┴───╯ │ │ │ ╭───┬───╮ │ │ 2 │ │ 0 │ 2 │ │ │ │ ╰───┴───╯ │ ╰───┴───────────╯ > [1 3 1 3 2 1 1] \| group-by --to-table ╭─#─┬─group─┬───items───╮ │ 0 │ 1 │ ╭───┬───╮ │ │ │ │ │ 0 │ 1 │ │ │ │ │ │ 1 │ 1 │ │ │ │ │ │ 2 │ 1 │ │ │ │ │ │ 3 │ 1 │ │ │ │ │ ╰───┴───╯ │ │ 1 │ 3 │ ╭───┬───╮ │ │ │ │ │ 0 │ 3 │ │ │ │ │ │ 1 │ 3 │ │ │ │ │ ╰───┴───╯ │ │ 2 │ 2 │ ╭───┬───╮ │ │ │ │ │ 0 │ 2 │ │ │ │ │ ╰───┴───╯ │ ╰─#─┴─group─┴───items───╯ > [1 3 1 3 2 1 1] \| group-by { $in >= 2 } ╭───────┬───────────╮ │ │ ╭───┬───╮ │ │ false │ │ 0 │ 1 │ │ │ │ │ 1 │ 1 │ │ │ │ │ 2 │ 1 │ │ │ │ │ 3 │ 1 │ │ │ │ ╰───┴───╯ │ │ │ ╭───┬───╮ │ │ true │ │ 0 │ 3 │ │ │ │ │ 1 │ 3 │ │ │ │ │ 2 │ 2 │ │ │ │ ╰───┴───╯ │ ╰───────┴───────────╯ > [1 3 1 3 2 1 1] \| group-by { $in >= 2 } --to-table ╭─#─┬─group0─┬───items───╮ │ 0 │ false │ ╭───┬───╮ │ │ │ │ │ 0 │ 1 │ │ │ │ │ │ 1 │ 1 │ │ │ │ │ │ 2 │ 1 │ │ │ │ │ │ 3 │ 1 │ │ │ │ │ ╰───┴───╯ │ │ 1 │ true │ ╭───┬───╮ │ │ │ │ │ 0 │ 3 │ │ │ │ │ │ 1 │ 3 │ │ │ │ │ │ 2 │ 2 │ │ │ │ │ ╰───┴───╯ │ ╰─#─┴─group0─┴───items───╯ ``` ```nushell let data 
= [ [name, lang, year]; [andres, rb, "2019"], [jt, rs, "2019"], [storm, rs, "2021"] ] > $data ╭─#─┬──name──┬─lang─┬─year─╮ │ 0 │ andres │ rb │ 2019 │ │ 1 │ jt │ rs │ 2019 │ │ 2 │ storm │ rs │ 2021 │ ╰─#─┴──name──┴─lang─┴─year─╯ ``` ```nushell > $data \| group-by lang ╭────┬──────────────────────────────╮ │ │ ╭─#─┬──name──┬─lang─┬─year─╮ │ │ rb │ │ 0 │ andres │ rb │ 2019 │ │ │ │ ╰─#─┴──name──┴─lang─┴─year─╯ │ │ │ ╭─#─┬─name──┬─lang─┬─year─╮ │ │ rs │ │ 0 │ jt │ rs │ 2019 │ │ │ │ │ 1 │ storm │ rs │ 2021 │ │ │ │ ╰─#─┴─name──┴─lang─┴─year─╯ │ ╰────┴──────────────────────────────╯ ``` Group column is now named after the grouper, to allow multiple groupers. ```nushell > $data \| group-by lang --to-table # column names changed! ╭─#─┬─lang─┬────────────items─────────────╮ │ 0 │ rb │ ╭─#─┬──name──┬─lang─┬─year─╮ │ │ │ │ │ 0 │ andres │ rb │ 2019 │ │ │ │ │ ╰─#─┴──name──┴─lang─┴─year─╯ │ │ 1 │ rs │ ╭─#─┬─name──┬─lang─┬─year─╮ │ │ │ │ │ 0 │ jt │ rs │ 2019 │ │ │ │ │ │ 1 │ storm │ rs │ 2021 │ │ │ │ │ ╰─#─┴─name──┴─lang─┴─year─╯ │ ╰─#─┴─lang─┴────────────items─────────────╯ ``` Grouping by multiple columns makes finer grained aggregations possible. ```nushell > $data \| group-by lang year --to-table ╭─#─┬─lang─┬─year─┬────────────items─────────────╮ │ 0 │ rb │ 2019 │ ╭─#─┬──name──┬─lang─┬─year─╮ │ │ │ │ │ │ 0 │ andres │ rb │ 2019 │ │ │ │ │ │ ╰─#─┴──name──┴─lang─┴─year─╯ │ │ 1 │ rs │ 2019 │ ╭─#─┬─name─┬─lang─┬─year─╮ │ │ │ │ │ │ 0 │ jt │ rs │ 2019 │ │ │ │ │ │ ╰─#─┴─name─┴─lang─┴─year─╯ │ │ 2 │ rs │ 2021 │ ╭─#─┬─name──┬─lang─┬─year─╮ │ │ │ │ │ │ 0 │ storm │ rs │ 2021 │ │ │ │ │ │ ╰─#─┴─name──┴─lang─┴─year─╯ │ ╰─#─┴─lang─┴─year─┴────────────items─────────────╯ ``` Grouping by multiple columns, without `--to-table` returns a nested structure. This is equivalent to `$data \| group-by year \| split-by lang`, making `split-by` obsolete. 
```nushell > $data \| group-by lang year ╭────┬─────────────────────────────────────────╮ │ │ ╭──────┬──────────────────────────────╮ │ │ rb │ │ │ ╭─#─┬──name──┬─lang─┬─year─╮ │ │ │ │ │ 2019 │ │ 0 │ andres │ rb │ 2019 │ │ │ │ │ │ │ ╰─#─┴──name──┴─lang─┴─year─╯ │ │ │ │ ╰──────┴──────────────────────────────╯ │ │ │ ╭──────┬─────────────────────────────╮ │ │ rs │ │ │ ╭─#─┬─name─┬─lang─┬─year─╮ │ │ │ │ │ 2019 │ │ 0 │ jt │ rs │ 2019 │ │ │ │ │ │ │ ╰─#─┴─name─┴─lang─┴─year─╯ │ │ │ │ │ │ ╭─#─┬─name──┬─lang─┬─year─╮ │ │ │ │ │ 2021 │ │ 0 │ storm │ rs │ 2021 │ │ │ │ │ │ │ ╰─#─┴─name──┴─lang─┴─year─╯ │ │ │ │ ╰──────┴─────────────────────────────╯ │ ╰────┴─────────────────────────────────────────╯ ``` From #2607: > Here's a couple more examples without much explanation. This one shows adding two grouping keys. I'm always wanting to add more columns when using group-by and it just-work™️ `gb.exe -f movies-2.csv -k 3,2 -s 7 --skip_header` > > ``` > k:3 \| k:2 \| count \| sum:7 > -----------------------+-----------+-------+-------------------- > 20th Century Fox \| Drama \| 1 \| 117.09 > 20th Century Fox \| Romance \| 1 \| 39.66 > CBS \| Comedy \| 1 \| 77.09 > Disney \| Animation \| 4 \| 1264.23 > Disney \| Comedy \| 4 \| 950.27 > Fox \| Comedy \| 5 \| 661.85 > Independent \| Comedy \| 7 \| 399.07 > Independent \| Drama \| 4 \| 69.75 > Independent \| Romance \| 7 \| 1048.75 > Independent \| romance \| 1 \| 29.37 > ... 
> ``` This example can be achieved like this: ```nushell > open movies-2.csv \| group-by "Lead Studio" Genre --to-table \| insert count {get items \| length} \| insert sum { get items."Worldwide Gross" \| math sum} \| reject items \| sort-by "Lead Studio" Genre ╭─#──┬──────Lead Studio──────┬───Genre───┬─count─┬───sum───╮ │ 0 │ 20th Century Fox │ Drama │ 1 │ 117.09 │ │ 1 │ 20th Century Fox │ Romance │ 1 │ 39.66 │ │ 2 │ CBS │ Comedy │ 1 │ 77.09 │ │ 3 │ Disney │ Animation │ 4 │ 1264.23 │ │ 4 │ Disney │ Comedy │ 4 │ 950.27 │ │ 5 │ Fox │ Comedy │ 5 │ 661.85 │ │ 6 │ Fox │ comedy │ 1 │ 60.72 │ │ 7 │ Independent │ Comedy │ 7 │ 399.07 │ │ 8 │ Independent │ Drama │ 4 │ 69.75 │ │ 9 │ Independent │ Romance │ 7 │ 1048.75 │ │ 10 │ Independent │ romance │ 1 │ 29.37 │ ... ```	2024-11-15 06:40:49 -06:00
Douglas	00709fc5bd	Improves startup time when using std-lib (#13842 ) Updated summary for commit [`612e0e2`](`612e0e2160`) - While folks are welcome to read through the entire comments, the core information is summarized here. # Description This PR drastically improves startup times of Nushell by only parsing a single submodule of the Standard Library that provides the `banner` and `pwd` commands. All other Standard Library commands and submodules are parsed when imported by the user. This cuts startup times by more than 60%. At the moment, we have stopped adding to `std-lib` because every addition adds a small amount to the Nushell startup time. With this change, we should once again be able to allow new functionality to be added to the Standard Library without it impacting `nu` startup times. # User-Facing Changes * Nushell now starts about 60% faster * Breaking change: The `dirs` (Shells) aliases will return a warning message that it will not be auto-loaded in the following release, along with instructions on how to restore it (and disable the message) * The `use std <submodule> *` syntax is available for convenience, but should be avoided in scripts as it parses the entire `std` module and all other submodules and places it in scope. The correct syntax to *just* load a submodule is `use std/<submodule> *` (asterisk optional). The slash is important. This will be documented. * `use std *` can be used for convenience to load all of the library but still incurs the full loading-time. * `std/dirs`: Semi-breaking change. The `dirs` command replaces the `show` command. This is more in line with the directory-stack functionality found in other shells. Existing users will not be impacted by this as the alias (`shells`) remains the same. * Breaking-change: Technically a breaking change, but probably only impacts maintainers of `std`. The virtual path for the standard library has changed. 
It could previously be imported using its virtual path (and technically, this would have been the correct way to do it): ```nu use NU_STDLIB_VIRTUAL_DIR/std ``` The path is now simply `std/`: ```nu use std ``` All submodules have moved accordingly. # Timings Comparisons below were made: * In a temporary, clean config directory using `$env.XDG_CONFIG_HOME = (mktemp -d)`. * `nu` was run with a release build * `nu` was run one time to generate the default `config.nu` (etc.) files - Otherwise timings would include the user-prompt * The shell was exited and then restarted several times to get timing samples (Note: Old timings based on 0.97 rather than 0.98, but in the range of being accurate) \| Scenario \| `$nu.startup-time` \| \| --- \| --- \| \| 0.97.2 ([`aaaab8e`](`aaaab8e070`)) Without this PR \| 23ms - 24ms \| \| This PR with deprecated commands \| 9ms - <11ms \| \| This PR after deprecated commands are removed in following release \| 8ms - <10ms \| \| Final PR (remove deprecated), using `--no-std-lib` \| 6.1ms - 6.4ms \| \| Final PR (remove deprecated), using `--no-config-file` \| 3.1ms - 3.6ms \| \| Final PR (remove deprecated), using `--no-config-file --no-std-lib` \| 1ms - 1.5ms \| These last two timings point to the opportunity for further optimization (see comment in thread below (will link once I write it)). # Implementation details for future maintenance * `use std banner` is a ridiculously deceptive call. That call parses and imports all of `std` into scope. Simply replacing it with `use std/core *` is essentially what saves ~14-15ms. This *only* imports the submodule with the `banner` and `pwd` commands. * From the code-comments, the reason that `NU_STDLIB_VIRTUAL_DIR` was used as a prefix was so that there wouldn't be an issue if a user had a `./std/mod.nu` in the current directory. This does not appear to be an issue. 
After removing the prefix, I tested with both a relative module as well as one in the `$env.NU_LIB_DIRS` path, and in all cases the internal `std` still took precedence. * By removing the prefix, users can now `use std` (and variants) without requiring that it already be parsed and in scope. * In the next release, we'll stop autoloading the `dirs` (shells) functionality. While this only costs an additional 1-1.5ms, I think it's better moved to the `config.nu` where the user can optionally remove it. The main reason is its use of aliases (which have also caused issues) - The `n`, `p`, and `g` short-commands are valuable real-estate, and users may want to map these to something else. For this release, there's a `deprecated_dirs` module that is still autoloaded. As with the top-level commands, use of these will give a deprecation warning with instructions on how to handle going forward. To help with this, moved the aliases to their own submodule inside the `dirs` module. * Also sneaks in a small change where the top-level `dirs` command is now the replacement for `dirs show` * Fixed a double-import of `assert` in `dirs.nu` * The `show_banner` step is replaced with simply `banner` rather than re-importing it. * A `virtual_path` may now be referenced with either a forward-slash or a backward-slash on Windows. This allows `use std/<submodule>` to work on all platforms. # Performance side-notes: * Future parsing and/or IR improvements should improve performance even further. * While the existing load time penalty of `std-lib` was not noticeable on many systems, Nushell runs on a wide variety of hardware and OS platforms. Slower platforms will naturally see a bigger jump in performance here. For users starting multiple Nushell sessions frequently (e.g., `tmux`, Zellij, `screen`, et al.) it is recommended to keep total startup time (including user configuration) under ~250ms. 
# Tests + Formatting * All tests are green * Updated tests: - Removed the test that confirmed that `std` was loaded (since we don't). - Removed the `shells` test since it is not autoloaded. Main `dirs.nu` functionality is tested through `stdlib-test`. - Many tests assumed that the library was fully loaded, because it was (even though we didn't intend for it to be). Fixed those tests. - Tests now import only the necessary submodules (e.g., `use std/assert`, rather than `use std assert`) - Some tests thought they were loading `std/log`, but were doing so improperly. This was masked by the now-fixed "load-everything-into-scope bug". Local CI would pass due to the `$env.NU_LOG_<...>` variables being inherited from the calling process, but would fail in the "clean" GitHub CI environment. These tests have also been fixed. * Added additional tests for the changes # After Submitting Will update the Standard Library doc page	2024-10-03 06:28:22 -05:00
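The submodule import form described in the commit above can be sketched roughly as follows. This is a minimal illustration, assuming a Nushell release that already includes this change; the log message text is made up for the example.

```nu
# Load only the `assert` submodule; the rest of `std` is not parsed.
use std/assert
assert equal (1 + 1) 2

# Load only the logging submodule and emit a message.
# (The message text is purely illustrative.)
use std/log
log info "std/log loaded on demand"
```

By contrast, the commit notes that `use std assert` parses the entire `std` module first, which is why the slash form is preferred in scripts.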
Ian Manske	905ec88091	Update PR template (#12838 ) # Description Updates the command listed in the PR template to test the standard library, following from #11151.	2024-05-13 08:45:44 -05:00
Ian Manske	1038c64f80	Add `sys` subcommands (#12747 ) # Description Adds subcommands to `sys` corresponding to each column of the record returned by `sys`. This is to alleviate the fact that `sys` now returns a regular record, meaning that it must compute every column, which might take a noticeable amount of time. The subcommands, on the other hand, only need to compute and return a subset of the data, which should be much faster. In fact, it should be as fast as before, since this is how the lazy record worked (it would compute each column only as necessary). I chose to add subcommands instead of having an optional cell-path parameter on `sys`, since the cell-path parameter would: - increase the code complexity (can access any value at any row or nested column) - prevent discovery with tab-completion - hinder type checking and allow users to pass potentially invalid columns # User-Facing Changes Deprecates `sys` in favor of the new `sys` subcommands.	2024-05-06 23:20:27 +00:00
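As a rough sketch of what the commit above describes, each subcommand computes only its own slice of the system information. The field names used here (`total`, `hostname`) are assumptions about the shape of the returned records and are not stated in the commit message.

```nu
# Compute only the memory information instead of the whole `sys` record.
let mem = (sys mem)
print $"total memory: ($mem.total)"   # `total` is an assumed field name

# Compute only the host information.
let host = (sys host)
print $"hostname: ($host.hostname)"   # `hostname` is an assumed field name
```

Calling plain `sys` would still build every column, which is the cost the subcommands are meant to avoid.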
Devyn Cairns	0884d1a5ce	Fix testing.nu import of std log (#12392 ) # Description `use std/log.nu` does not work, have to `use std log` # User-Facing Changes Fix the testing script. Bug fix.	2024-04-05 20:29:19 -05:00
Wind	f7d647ac3c	`open`, `rm`, `umv`, `cp`, `rm` and `du`: Don't globs if inputs are variables or string interpolation (#11886 ) # Description This is a follow up to https://github.com/nushell/nushell/pull/11621#issuecomment-1937484322 Also Fixes: #11838 ## About the code change It applies the same logic as when we pass variables to external commands: `0487e9ffcb/crates/nu-command/src/system/run_external.rs (L162-L170)` That is: if the user passes dynamic input (such as variables, sub-expressions, or string interpolation), it returns a quoted `NuPath`, so the input won't be globbed # User-Facing Changes Given two input files: `a*c.txt`, `abc.txt` * `let f = "a*c.txt"; rm $f` will remove one file: `a*c.txt`. ~* `let f = "a*c.txt"; rm --glob $f` will remove `a*c.txt` and `abc.txt`~ * `let f: glob = "a*c.txt"; rm $f` will remove `a*c.txt` and `abc.txt` ## Rules about globbing with variable Given two files: `a*c.txt`, `abc.txt` \| Cmd Type \| example \| Result \| \| ----- \| ------------------ \| ------ \| \| builtin \| let f = "a*c.txt"; rm $f \| remove `a*c.txt` \| \| builtin \| let f: glob = "a*c.txt"; rm $f \| remove `a*c.txt` and `abc.txt` \| builtin \| let f = "a*c.txt"; rm ($f \\| into glob) \| remove `a*c.txt` and `abc.txt` \| custom \| def crm [f: glob] { rm $f }; let f = "a*c.txt"; crm $f \| remove `a*c.txt` and `abc.txt` \| custom \| def crm [f: glob] { rm ($f \\| into string) }; let f = "a*c.txt"; crm $f \| remove `a*c.txt` \| custom \| def crm [f: string] { rm $f }; let f = "a*c.txt"; crm $f \| remove `a*c.txt` \| custom \| def crm [f: string] { rm $f }; let f = "a*c.txt"; crm ($f \\| into glob) \| remove `a*c.txt` and `abc.txt` In general, if a variable is annotated with the `glob` type, nushell will expand the glob pattern. Otherwise, we need to use `into glob` to expand the glob pattern # Tests + Formatting Done # After Submitting I think the `str glob-escape` command will no longer be required. We can remove it.	2024-02-23 09:17:09 +08:00
Antoine Stevan	ef1d70eb67	hide `std testing` (#11331 ) follow-up to - https://github.com/nushell/nushell/pull/11151 > Important > land only between 0.89 and 0.90 # Description this PR hides the `std testing` module from the outside. - moves `nu-std/std/testing.nu` to `nu-std/testing.nu` - removes the module from the standard library list of modules to parse - fixes `toolkit.nu` and the CI # User-Facing Changes `std testing` won't be part of the standard library anymore. # Tests + Formatting # After Submitting	2024-01-25 12:50:07 +02:00

Renamed from crates/nu-std/std/testing.nu

8 commits