# Description
Allows `polars str-lengths` to be used as an expression:
<img width="826" alt="Screenshot 2024-09-04 at 13 57 45"
src="https://github.com/user-attachments/assets/b74139e0-e8ba-4910-84c2-cf4be4a084b6">
# User-Facing Changes
- `polars str-lengths` can be used as an expression.
- char length is now the default. Use the --bytes flag to get bytes
length.
Bumps [indexmap](https://github.com/indexmap-rs/indexmap) from 2.4.0 to
2.5.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/indexmap-rs/indexmap/blob/master/RELEASES.md">indexmap's
changelog</a>.</em></p>
<blockquote>
<h2>2.5.0</h2>
<ul>
<li>Added an <code>insert_before</code> method to <code>IndexMap</code>
and <code>IndexSet</code>, as an
alternative to <code>shift_insert</code> with different behavior on
existing entries.</li>
<li>Added <code>first_entry</code> and <code>last_entry</code> methods
to <code>IndexMap</code>.</li>
<li>Added <code>From</code> implementations between
<code>IndexedEntry</code> and <code>OccupiedEntry</code>.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="48ed49017c"><code>48ed490</code></a>
Release 2.5.0</li>
<li><a
href="139d7addfb"><code>139d7ad</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/340">#340</a>
from cuviper/insert-bounds</li>
<li><a
href="1d9b5e3d03"><code>1d9b5e3</code></a>
Add doc examples for <code>insert_before</code> and
<code>shift_insert</code></li>
<li><a
href="8ca01b0df7"><code>8ca01b0</code></a>
Use <code>insert_before</code> for "new" entries in
<code>insert_sorted</code></li>
<li><a
href="7224def010"><code>7224def</code></a>
Add <code>insert_before</code> as an alternate to
<code>shift_insert</code></li>
<li><a
href="0247a1555d"><code>0247a15</code></a>
Document and assert index bounds in <code>shift_insert</code></li>
<li><a
href="922c6ad1af"><code>922c6ad</code></a>
Update the CI badge</li>
<li><a
href="e482e1768a"><code>e482e17</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/342">#342</a>
from cuviper/btree-like</li>
<li><a
href="b63e4a1556"><code>b63e4a1</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/341">#341</a>
from cuviper/from-entry</li>
<li><a
href="264e5b7304"><code>264e5b7</code></a>
Add doc aliases like <code>BTreeMap</code>/<code>BTreeSet</code></li>
<li>Additional commits viewable in <a
href="https://github.com/indexmap-rs/indexmap/compare/2.4.0...2.5.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=indexmap&package-manager=cargo&previous-version=2.4.0&new-version=2.5.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# Description
Adds the ability for `polars replace` and `polars replace-all` to work
as expressions.
# User-Facing Changes
- `polars replace` can be used with polars expressions
- `polars replace-all` can be used with polars expressions
# Description
The meaning of the word usage is specific to describing how a command
function is *used* and not a synonym for general description. Usage can
be used to describe the SYNOPSIS or EXAMPLES sections of a man page
where the permitted argument combinations are shown or example *uses*
are given.
Let's not confuse people and call it what it is a description.
Our `help` command already creates its own *Usage* section based on the
available arguments and doesn't refer to the description with usage.
# User-Facing Changes
`help commands` and `scope commands` will now use `description` or
`extra_description`
`usage`-> `description`
`extra_usage` -> `extra_description`
Breaking change in the plugin protocol:
In the signature record communicated with the engine.
`usage`-> `description`
`extra_usage` -> `extra_description`
The same rename also takes place for the methods on
`SimplePluginCommand` and `PluginCommand`
# Tests + Formatting
- Updated plugin protocol specific changes
# After Submitting
- [ ] update plugin protocol doc
# Description
Fixes issue [12828](https://github.com/nushell/nushell/issues/12828).
When attempting a `polars collect` on an eager dataframe, we return
dataframe as is. However, before this fix I failed to increment the
internal cache reference count. This caused the value to be dropped from
the internal cache when the references were decremented again.
This fix adds a call to cache.get to increment the value before
returning.
Bumps [indexmap](https://github.com/indexmap-rs/indexmap) from 2.3.0 to
2.4.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/indexmap-rs/indexmap/blob/master/RELEASES.md">indexmap's
changelog</a>.</em></p>
<blockquote>
<h2>2.4.0</h2>
<ul>
<li>Added methods <code>IndexMap::append</code> and
<code>IndexSet::append</code>, moving all items from
one map or set into another, and leaving the original capacity for
reuse.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b66bbfe282"><code>b66bbfe</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/337">#337</a>
from cuviper/append</li>
<li><a
href="a288bf3b3a"><code>a288bf3</code></a>
Add a README note for <code>ordermap</code></li>
<li><a
href="0b2b4b9a78"><code>0b2b4b9</code></a>
Release 2.4.0</li>
<li><a
href="8c0a1cd4be"><code>8c0a1cd</code></a>
Add <code>append</code> methods</li>
<li>See full diff in <a
href="https://github.com/indexmap-rs/indexmap/compare/2.3.0...2.4.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=indexmap&package-manager=cargo&previous-version=2.3.0&new-version=2.4.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# Description
This pull request merges `polars sink` and `polars to-*` into one
command `polars save`.
# User-Facing Changes
- `polars to-*` commands have all been replaced with `polars save`. When
saving a lazy frame to a type that supports a polars sink operation, a
sink operation will be performed. Sink operations are much more
performant, performing a collect while streaming to the file system.
# Description
This exposes the `LazyFrame::sink_*` functionality to allow a streaming
collect directly to the filesystem. This useful when working with data
that is too large to fit into memory.
# User-Facing Changes
- Introduction of the `polars sink` command
# Description
Prior this pull request `polars first` and `polars last` would collect a
lazy frame into an eager frame before performing operations. Now `polars
first` will to a `LazyFrame::limit` and `polars last` will perform a
`LazyFrame::tail`. This is really useful in working with very large
datasets.
# Description
When opening a dataframe the default operation will be to create a lazy
frame if possible. This works much better with large datasets and
supports hive format.
# User-Facing Changes
- `--lazy` is nolonger a valid option. `--eager` must be used to
explicitly open an eager dataframe.
Bumps [indexmap](https://github.com/indexmap-rs/indexmap) from 2.2.6 to
2.3.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/indexmap-rs/indexmap/blob/master/RELEASES.md">indexmap's
changelog</a>.</em></p>
<blockquote>
<h2>2.3.0</h2>
<ul>
<li>Added trait <code>MutableEntryKey</code> for opt-in mutable access
to map entry keys.</li>
<li>Added method <code>MutableKeys::iter_mut2</code> for opt-in mutable
iteration of map
keys and values.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="22c0b4e0f3"><code>22c0b4e</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/335">#335</a>
from epage/mut</li>
<li><a
href="39f7cc097a"><code>39f7cc0</code></a>
Release 2.3.0</li>
<li><a
href="6049d518a0"><code>6049d51</code></a>
feat(map): Add MutableKeys::iter_mut2</li>
<li><a
href="65c3c46e37"><code>65c3c46</code></a>
feat(map): Add MutableEntryKey</li>
<li><a
href="7f7d39f734"><code>7f7d39f</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/332">#332</a>
from waywardmonkeys/missing-indentation-in-doc-comment</li>
<li><a
href="8222a59a85"><code>8222a59</code></a>
Fix missing indentation in doc comment.</li>
<li><a
href="1a71dde63d"><code>1a71dde</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/327">#327</a>
from waywardmonkeys/dep-update-dev-dep-itertools</li>
<li><a
href="ac2a8a5a40"><code>ac2a8a5</code></a>
deps(dev): Update <code>itertools</code></li>
<li>See full diff in <a
href="https://github.com/indexmap-rs/indexmap/compare/2.2.6...2.3.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=indexmap&package-manager=cargo&previous-version=2.2.6&new-version=2.3.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx
you can also mention related issues, PRs or discussions!
-->
Per discussion on
[Discord](https://discord.com/channels/601130461678272522/864228801851949077/1265718178927870045)
# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.
Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->
To facilitate column-oriented dataframe construction, this PR added a
`--as-columns` flag to `polars into-df` command so that when specified,
and when input shape is record of lists, each list will be treated as a
column rather than a cell value, i.e. `{a: [1 3], b: [2 4]} | polars
into-df --as-columns` returns the same dataframe as `[[a b];[1 2] [3 4]]
| polars into-df`
# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->
A new flag `--as-columns`, no change of semantics if this flag is
unspecified.
# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.
Make sure you've run and fixed any issues with these commands:
- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library
> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->
# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
---------
Co-authored-by: Ben Yang <ben@ya.ng>
Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.9.1 to 1.10.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/uuid-rs/uuid/releases">uuid's
releases</a>.</em></p>
<blockquote>
<h2>1.10.0</h2>
<h2>Deprecations</h2>
<p>This release deprecates and renames the following functions:</p>
<ul>
<li><code>Builder::from_rfc4122_timestamp</code> ->
<code>Builder::from_gregorian_timestamp</code></li>
<li><code>Builder::from_sorted_rfc4122_timestamp</code> ->
<code>Builder::from_sorted_gregorian_timestamp</code></li>
<li><code>Timestamp::from_rfc4122</code> ->
<code>Timestamp::from_gregorian</code></li>
<li><code>Timestamp::to_rfc4122</code> ->
<code>Timestamp::to_gregorian</code></li>
</ul>
<h2>What's Changed</h2>
<ul>
<li>Use const identifier in uuid macro by <a
href="https://github.com/Vrajs16"><code>@Vrajs16</code></a> in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/764">uuid-rs/uuid#764</a></li>
<li>Rename most methods referring to RFC4122 by <a
href="https://github.com/Mikopet"><code>@Mikopet</code></a> / <a
href="https://github.com/KodrAus"><code>@KodrAus</code></a> in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/765">uuid-rs/uuid#765</a></li>
<li>prepare for 1.10.0 release by <a
href="https://github.com/KodrAus"><code>@KodrAus</code></a> in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/766">uuid-rs/uuid#766</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/Vrajs16"><code>@Vrajs16</code></a> made
their first contribution in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/764">uuid-rs/uuid#764</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/uuid-rs/uuid/compare/1.9.1...1.10.0">https://github.com/uuid-rs/uuid/compare/1.9.1...1.10.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4b4c590ae3"><code>4b4c590</code></a>
Merge pull request <a
href="https://redirect.github.com/uuid-rs/uuid/issues/766">#766</a> from
uuid-rs/cargo/1.10.0</li>
<li><a
href="68eff32640"><code>68eff32</code></a>
Merge pull request <a
href="https://redirect.github.com/uuid-rs/uuid/issues/765">#765</a> from
uuid-rs/chore/time-fn-deprecations</li>
<li><a
href="3d5384da4b"><code>3d5384d</code></a>
update docs and deprecation messages for timestamp fns</li>
<li><a
href="de50f2091f"><code>de50f20</code></a>
renaming rfc4122 functions</li>
<li><a
href="4a8841792a"><code>4a88417</code></a>
prepare for 1.10.0 release</li>
<li><a
href="66b4fcef14"><code>66b4fce</code></a>
Merge pull request <a
href="https://redirect.github.com/uuid-rs/uuid/issues/764">#764</a> from
Vrajs16/main</li>
<li><a
href="8896e26c42"><code>8896e26</code></a>
Use expr instead of ident</li>
<li><a
href="09973d6aff"><code>09973d6</code></a>
Added changes</li>
<li><a
href="6edf3e8cd5"><code>6edf3e8</code></a>
Use const identifer in uuid macro</li>
<li>See full diff in <a
href="https://github.com/uuid-rs/uuid/compare/1.9.1...1.10.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=uuid&package-manager=cargo&previous-version=1.9.1&new-version=1.10.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# Description
Makes `polars unpivot` use the same arguments as `polars pivot` and
makes it consistent with the polars' rust api. Additionally, support for
the polar's streaming engine has been exposed on eager dataframes.
Previously, it would only work with lazy dataframes.
# User-Facing Changes
* `polars unpivot` argument `--columns`|`-c` has been renamed to
`--index`|`-i`
* `polars unpivot` argument `--values`|`-v` has been renamed to
`--on`|`-o`
* `polars unpivot` short argument for `--streamable` is now `-t` to make
it consistent with `polars pivot`. It was made `-t` for `polars pivot`
because `-s` is short for `--short`
There was a bug where anytime the plugin cache remove was called, the
plugin gc was turned back on. This probably happened when I added the
reference counter logic.
# Description
Upgrading to Polars 0.41
# User-Facing Changes
* `polars melt` has been renamed to `polars unpivot` to match the change
in the polars API. Additionally, it now supports lazy dataframes.
Introduced a `--streamable` option to use the polars streaming engine
for lazy frames.
* The parameter `outer` has been replaced with `full` in `polars join`
to match polars change.
* `polars value-count` now supports the column (rename count column),
parallelize (multithread), sort, and normalize options.
The list of polars changes can be found
[here](https://github.com/pola-rs/polars/releases/tag/rs-0.41.2)
In this pull request, I converted the `perf` function within `nu_utils`
to a macro. This change facilitates easier usage within plugins by
allowing the use of `env_logger` and setting `RUST_LOG=nu_plugin_polars`
(or another plugin). Without this conversion, the `RUST_LOG` variable
would need to be set to `RUST_LOG=nu_utils::utils`, which is less
intuitive and impossible to narrow the perf results to one plugin.
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx
you can also mention related issues, PRs or discussions!
-->
# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.
Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->
# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->
# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.
Make sure you've run and fixed any issues with these commands:
- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library
> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->
# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
Addresses performance issues that @maxim-uvarov found with CSV and JSON
lines.
This ensures that the schema inference follows the polars defaults of
100 lines. Recent changes caused the default values to be override and
caused the entire file to be scanned when inferring the schema.
# Description
This allows plugins to report their version (and potentially other
metadata in the future). The version is shown in `plugin list` and in
`version`.
The metadata is stored in the registry file, and reflects whatever was
retrieved on `plugin add`, not necessarily the running binary. This can
help you to diagnose if there's some kind of mismatch with what you
expect. We could potentially use this functionality to show a warning or
error if a plugin being run does not have the same version as what was
in the cache file, suggesting `plugin add` be run again, but I haven't
done that at this point.
It is optional, and it requires the plugin author to make some code
changes if they want to provide it, since I can't automatically
determine the version of the calling crate or anything tricky like that
to do it.
Example:
```
> plugin list | select name version is_running pid
╭───┬────────────────┬─────────┬────────────┬─────╮
│ # │ name │ version │ is_running │ pid │
├───┼────────────────┼─────────┼────────────┼─────┤
│ 0 │ example │ 0.93.1 │ false │ │
│ 1 │ gstat │ 0.93.1 │ false │ │
│ 2 │ inc │ 0.93.1 │ false │ │
│ 3 │ python_example │ 0.1.0 │ false │ │
╰───┴────────────────┴─────────┴────────────┴─────╯
```
cc @maxim-uvarov (he asked for it)
# User-Facing Changes
- `plugin list` gets a `version` column
- `version` shows plugin versions when available
- plugin authors *should* add `fn metadata()` to their `impl Plugin`,
but don't have to
# Tests + Formatting
Tested the low level stuff and also the `plugin list` column.
# After Submitting
- [ ] update plugin guide docs
- [ ] update plugin protocol docs (`Metadata` call & response)
- [ ] update plugin template (`fn metadata()` should be easy)
- [ ] release notes
This allows performance debugging to be turned on by setting:
```nushell
$env.POLARS_PLUGIN_PERF = "true"
```
Furthermore, this improves the other plugin debugging by allowing the
env variable for debugging to be set at any time versus having to be
available when nushell is launched:
```nushell
$env.POLARS_PLUGIN_DEBUG = "true"
```
This plugin introduces a `perf` function that will output timing
results. This works very similar to the perf function available in
nu_utils::utils::perf. This version prints everything to std error to
not break the plugin stream and uses the engine interface to see if the
env variable is configured.
This pull requests uses this `perf` function when:
* opening csv files as dataframes
* opening json lines files as dataframes
This will hopefully help provide some more fine grained information on
how long it takes polars to open different dataframes. The `perf` can
also be utilized later for other dataframes use cases.
Per discussion on discord dataframes channel with @maxim-uvarov and pyz.
When converting a dataframe to an nushell value via `polars into-nu`,
the index column should not be added by default and should only be added
when specifying `--index`
As reported by @maxim-uvarov and pyz in the dataframes discord channel:
```nushell
[[a b]; [1 1] [1 2] [2 1] [2 2] [3 1] [3 2]] | polars into-df | polars with-column ((polars col a) / (polars col b)) --name c
× Type mismatch.
╭─[entry #45:1:102]
1 │ [[a b]; [1 1] [1 2] [2 1] [2 2] [3 1] [3 2]] | polars into-df | polars with-column ((polars col a) / (polars col b)) --name c
· ───────┬──────
· ╰── Right hand side not a dataframe expression
╰────
```
This pull request corrects the type casting on the right hand side and
allows more than just polars literal expressions.
# Description
@maxim-uvarov did a ton of research and work with the dply-rs author and
ritchie from polars and found out that the allocator matters on macos
and it seems to be what was messing up the performance of polars plugin.
ritchie suggested to use jemalloc but i switched it to mimalloc to match
nushell and it seems to run better.
## Before (default allocator)
note - using 1..10 vs 1..100 since it takes so long. also notice how
high the `max` timings are compared to mimalloc below.
```nushell
❯ 1..10 | each {timeit {polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null}} | | {mean: ($in | math avg), min: ($in | math min), max: ($in | math max), stddev: ($in | into int | into float | math stddev | into int | $'($in)ns' | into duration)}
╭────────┬─────────────────────────╮
│ mean │ 4sec 999ms 605µs 995ns │
│ min │ 983ms 627µs 42ns │
│ max │ 13sec 398ms 135µs 791ns │
│ stddev │ 3sec 476ms 479µs 939ns │
╰────────┴─────────────────────────╯
❯ use std bench
❯ bench { polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null } -n 10
╭───────┬────────────────────────╮
│ mean │ 6sec 220ms 783µs 983ns │
│ min │ 1sec 184ms 997µs 708ns │
│ max │ 18sec 882ms 81µs 708ns │
│ std │ 5sec 350ms 375µs 697ns │
│ times │ [list 10 items] │
╰───────┴────────────────────────╯
```
## After (using mimalloc)
```nushell
❯ 1..100 | each {timeit {polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null}} | | {mean: ($in | math avg), min: ($in | math min), max: ($in | math max), stddev: ($in | into int | into float | math stddev | into int | $'($in)ns' | into duration)}
╭────────┬───────────────────╮
│ mean │ 103ms 728µs 902ns │
│ min │ 97ms 107µs 42ns │
│ max │ 149ms 430µs 84ns │
│ stddev │ 5ms 690µs 664ns │
╰────────┴───────────────────╯
❯ use std bench
❯ bench { polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null } -n 100
╭───────┬───────────────────╮
│ mean │ 103ms 620µs 195ns │
│ min │ 97ms 541µs 166ns │
│ max │ 130ms 262µs 166ns │
│ std │ 4ms 948µs 654ns │
│ times │ [list 100 items] │
╰───────┴───────────────────╯
```
## After (using jemalloc - just for comparison)
```nushell
❯ 1..100 | each {timeit {polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null}} | | {mean: ($in | math avg), min: ($in | math min), max: ($in | math max), stddev: ($in | into int | into float | math stddev | into int | $'($in)ns' | into duration)}
╭────────┬───────────────────╮
│ mean │ 113ms 939µs 777ns │
│ min │ 108ms 337µs 333ns │
│ max │ 166ms 467µs 458ns │
│ stddev │ 6ms 175µs 618ns │
╰────────┴───────────────────╯
❯ use std bench
❯ bench { polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null } -n 100
╭───────┬───────────────────╮
│ mean │ 114ms 363µs 530ns │
│ min │ 108ms 804µs 833ns │
│ max │ 143ms 521µs 459ns │
│ std │ 5ms 88µs 56ns │
│ times │ [list 100 items] │
╰───────┴───────────────────╯
```
## After (using parquet + mimalloc)
```nushell
❯ 1..100 | each {timeit {polars open data.parquet | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null}} | | {mean: ($in | math avg), min: ($in | math min), max: ($in | math max), stddev: ($in | into int | into float | math stddev | into int | $'($in)ns' | into duration)}
╭────────┬──────────────────╮
│ mean │ 34ms 255µs 492ns │
│ min │ 31ms 787µs 250ns │
│ max │ 76ms 408µs 416ns │
│ stddev │ 4ms 472µs 916ns │
╰────────┴──────────────────╯
❯ use std bench
❯ bench { polars open data.parquet | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null } -n 100
╭───────┬──────────────────╮
│ mean │ 34ms 897µs 562ns │
│ min │ 31ms 518µs 542ns │
│ max │ 65ms 943µs 625ns │
│ std │ 3ms 450µs 741ns │
│ times │ [list 100 items] │
╰───────┴──────────────────╯
```
# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->
# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.
Make sure you've run and fixed any issues with these commands:
- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library
> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->
# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
This reverts commit 68adc4657f.
# Description
Reverts the lazyframe refactor (#12669) for the next release, since
there are still a few lingering issues. This temporarily solves #12863
and #12828. After the release, the lazyframes can be added back and
cleaned up.
# Description
Removes the old `nu-cmd-dataframe` crate in favor of the polars plugin.
As such, this PR also removes the `dataframe` feature, related CI, and
full releases of nushell.
# Description
This PR adds a few functions to `Span` for merging spans together:
- `Span::append`: merges two spans that are known to be in order.
- `Span::concat`: returns a span that encompasses all the spans in a
slice. The spans must be in order.
- `Span::merge`: merges two spans (no order necessary).
- `Span::merge_many`: merges an iterator of spans into a single span (no
order necessary).
These are meant to replace the free-standing `nu_protocol::span`
function.
The spans in a `LiteCommand` (the `parts`) should always be in order
based on the lite parser and lexer. So, the parser code sees the most
usage of `Span::append` and `Span::concat` where the order is known. In
other code areas, `Span::merge` and `Span::merge_many` are used since
the order between spans is often not known.
# Description
This PR introduces a `ByteStream` type which is a `Read`-able stream of
bytes. Internally, it has an enum over three different byte stream
sources:
```rust
pub enum ByteStreamSource {
Read(Box<dyn Read + Send + 'static>),
File(File),
Child(ChildProcess),
}
```
This is in comparison to the current `RawStream` type, which is an
`Iterator<Item = Vec<u8>>` and has to allocate for each read chunk.
Currently, `PipelineData::ExternalStream` serves a weird dual role where
it is either external command output or a wrapper around `RawStream`.
`ByteStream` makes this distinction more clear (via `ByteStreamSource`)
and replaces `PipelineData::ExternalStream` in this PR:
```rust
pub enum PipelineData {
Empty,
Value(Value, Option<PipelineMetadata>),
ListStream(ListStream, Option<PipelineMetadata>),
ByteStream(ByteStream, Option<PipelineMetadata>),
}
```
The PR is relatively large, but a decent amount of it is just repetitive
changes.
This PR fixes#7017, fixes#10763, and fixes#12369.
This PR also improves performance when piping external commands. Nushell
should, in most cases, have competitive pipeline throughput compared to,
e.g., bash.
| Command | Before (MB/s) | After (MB/s) | Bash (MB/s) |
| -------------------------------------------------- | -------------:|
------------:| -----------:|
| `throughput \| rg 'x'` | 3059 | 3744 | 3739 |
| `throughput \| nu --testbin relay o> /dev/null` | 3508 | 8087 | 8136 |
# User-Facing Changes
- This is a breaking change for the plugin communication protocol,
because the `ExternalStreamInfo` was replaced with `ByteStreamInfo`.
Plugins now only have to deal with a single input stream, as opposed to
the previous three streams: stdout, stderr, and exit code.
- The output of `describe` has been changed for external/byte streams.
- Temporary breaking change: `bytes starts-with` no longer works with
byte streams. This is to keep the PR smaller, and `bytes ends-with`
already does not work on byte streams.
- If a process core dumped, then instead of having a `Value::Error` in
the `exit_code` column of the output returned from `complete`, it now is
a `Value::Int` with the negation of the signal number.
# After Submitting
- Update docs and book as necessary
- Release notes (e.g., plugin protocol changes)
- Adapt/convert commands to work with byte streams (high priority is
`str length`, `bytes starts-with`, and maybe `bytes ends-with`).
- Refactor the `tee` code, Devyn has already done some work on this.
---------
Co-authored-by: Devyn Cairns <devyn.cairns@gmail.com>