I had previously changed NuLazyFrame::collect to set the NuDataFrame's
from_lazy field to false to prevent conversion back to a lazy frame. It
appears there are cases where this should happen. Instead, I am only
setting from_lazy=false inside the `polars collect` command.
[Related discord
message](https://discord.com/channels/601130461678272522/1227612017171501136/1230600465159421993)
Co-authored-by: Jack Wright <jack.wright@disqo.com>
# Description
This is just some cleanup. I moved to_pipeline_data and to_cache_value
to the CustomValueSupport trait, where I should've put them to begin
with.
Co-authored-by: Jack Wright <jack.wright@disqo.com>
# Description
@maxim-uvarov discovered the following error:
```
> [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars sort-by a | polars unique --subset [a]
Error: × Error using as series
╭─[entry #1:1:68]
1 │ [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars sort-by a | polars unique --subset [a]
· ──────┬──────
· ╰── dataframe has more than one column
╰────
```
During investigation, I discovered the root cause was that the lazy frame was incorrectly converted back to a eager dataframe. In order to keep this from happening, I explicitly set that the dataframe did not come from an eager frame. This causes the conversion logic to not attempt to convert the dataframe later in the pipeline.
---------
Co-authored-by: Jack Wright <jack.wright@disqo.com>
# Description
This adds a `SharedCow` type as a transparent copy-on-write pointer that
clones to unique on mutate.
As an initial test, the `Record` within `Value::Record` is shared.
There are some pretty big wins for performance. I'll post benchmark
results in a comment. The biggest winner is nested access, as that would
have cloned the records for each cell path follow before and it doesn't
have to anymore.
The reusability of the `SharedCow` type is nice and I think it could be
used to clean up the previous work I did with `Arc` in `EngineState`.
It's meant to be a mostly transparent clone-on-write that just clones on
`.to_mut()` or `.into_owned()` if there are actually multiple
references, but avoids cloning if the reference is unique.
# User-Facing Changes
- `Value::Record` field is a different type (plugin authors)
# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
# After Submitting
- [ ] use for `EngineState`
- [ ] use for `Value::List`
# Description
From @maxim-uvarov's
[post](https://discord.com/channels/601130461678272522/1227612017171501136/1228656319704203375).
When calling `to-lazy` back to back in a pipeline, an error should not
occur:
```
> [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars into-lazy
Error: nu:🐚:cant_convert
× Can't convert to NuDataFrame.
╭─[entry #1:1:30]
1 │ [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars into-lazy
· ────────┬───────
· ╰── can't convert NuLazyFrameCustomValue to NuDataFrame
╰────
```
This pull request ensures that custom value's of NuLazyFrameCustomValue are properly converted when passed in.
Co-authored-by: Jack Wright <jack.wright@disqo.com>
# Description
@maxim-uvarov discovered an issue with the current implementation. When
executing [[index a]; [1 1]] | polars into-df, a plugin_failed_to_decode
error occurs. This happens because a Record is created with two columns
named "index" as an index column is added during conversion. This pull
request addresses the problem by not adding an index column if there is
already a column named "index" in the dataframe.
---------
Co-authored-by: Jack Wright <jack.wright@disqo.com>
# Description
All polars commands that output a file were not handling relative paths
correctly.
A command like
``` [[a b]; [6 2] [1 4] [4 1]] | polars into-df | polars to-parquet foo.json```
was outputting the foo.json to the directory of the plugin executable.
This pull request pulls in nu-path and using it for resolving the file paths.
Related discussion
https://discord.com/channels/601130461678272522/1227612017171501136/1227889870358183966
# User-Facing Changes
None
# Tests + Formatting
Done, added tests for each of the polars to-* commands.
---------
Co-authored-by: Jack Wright <jack.wright@disqo.com>
# Description
`polars ls` is already different that `dfr ls`. Currently it just shows
the cache key, columns, rows, and type. I have added:
- creation time
- size
- span contents
- span start and end
<img width="1471" alt="Screenshot 2024-04-10 at 17 27 06"
src="https://github.com/nushell/nushell/assets/56345/545918b7-7c96-4c25-bc01-b9e2b659a408">
# Tests + Formatting
Done
Co-authored-by: Jack Wright <jack.wright@disqo.com>