Commit graph

22 commits

Author SHA1 Message Date
Bahex
5f7082f053
truly flexible csv/tsv parsing (#14399)
- fixes #14398

I will properly fill out this PR and fix any tests that might break when
I have the time, this was a quick fix.

# Description

This PR makes `from csv` and `from tsv`, with the `--flexible` flag,
stop dropping extra/unexpected columns.

# User-Facing Changes

`$text`'s contents
```csv
value
1,aaa
2,bbb
3
4,ddd
5,eee,extra
```

Old behavior
```nushell
> $text | from csv --flexible --noheaders 
╭─#─┬─column0─╮
│ 0 │ value   │
│ 1 │       1 │
│ 2 │       2 │
│ 3 │       3 │
│ 4 │       4 │
│ 5 │       5 │
╰─#─┴─column0─╯
```

New behavior
```nushell
> $text | from csv --flexible --noheaders 
╭─#─┬─column0─┬─column1─┬─column2─╮
│ 0 │ value   │       │
│ 1 │       1 │ aaa     │       │
│ 2 │       2 │ bbb     │       │
│ 3 │       3 │       │
│ 4 │       4 │ ddd     │       │
│ 5 │       5 │ eee     │ extra   │
╰─#─┴─column0─┴─column1─┴─column2─╯
```

- The first line in a csv (or tsv) document no longer limits the number
of columns
- Missing values in columns are longer automatically filled with `null`
with this change, as a later row can introduce new columns. **BREAKING
CHANGE**

Because missing columns are different from empty columns, operations on
possibly missing columns will have to use optional access syntax e.g.
`get foo` => `get foo?`
  
# Tests + Formatting
Added examples that run as tests and adjusted existing tests to confirm
the new behavior.

# After Submitting

Update the workaround with fish completer mentioned
[here](https://www.nushell.sh/cookbook/external_completers.html#fish-completer)
2024-11-21 15:58:31 -06:00
goldfish
ee74ec7423
Make the subcommands (from {csv, tsv, ssv}) 0-based for consistency (#13209)
# Description

fixed #11678

The sub-commands of from command (`from {csv, tsv, ssv}`) name columns
starting from index 0.
This behaviour is inconsistent with other commands such as `detect
columns`.
This PR makes the subcommands index 0-based.

# User-Facing Changes

The subcommands (`from {csv, tsv, ssv}`) return a table with the columns
starting at index 0 if no header data is passed.

```
~/Development/nushell> "foo bar baz" | from ssv -n -m 1 
╭───┬─────────┬─────────┬─────────╮
│ # │ column0 │ column1 │ column2 │
├───┼─────────┼─────────┼─────────┤
│ 0 │ foo     │ bar     │ baz     │
╰───┴─────────┴─────────┴─────────╯
~/Development/nushell> "foo,bar,baz" | from csv -n 
╭───┬─────────┬─────────┬─────────╮
│ # │ column0 │ column1 │ column2 │
├───┼─────────┼─────────┼─────────┤
│ 0 │ foo     │ bar     │ baz     │
╰───┴─────────┴─────────┴─────────╯
~/Development/nushell> "foo\tbar\tbaz" | from tsv -n
╭───┬─────────┬─────────┬─────────╮
│ # │ column0 │ column1 │ column2 │
├───┼─────────┼─────────┼─────────┤
│ 0 │ foo     │ bar     │ baz     │
╰───┴─────────┴─────────┴─────────╯
```


# Tests + Formatting

When I ran tests, `commands::touch::change_file_mtime_to_reference`
failed with the following error.
The error also occurs in the master branch, so it's probably unrelated
to these changes.
(maybe a problem with my dev environment)

```
$ toolkit check pr

~~~~~~~~

failures:

---- commands::touch::change_file_mtime_to_reference stdout ----
=== stderr

thread 'commands::touch::change_file_mtime_to_reference' panicked at crates/nu-command/tests/commands/touch.rs:298:9:
assertion `left == right` failed
  left: SystemTime { tv_sec: 1719149697, tv_nsec: 57576929 }
 right: SystemTime { tv_sec: 1719149697, tv_nsec: 78219489 }


failures:
    commands::touch::change_file_mtime_to_reference

test result: FAILED. 1533 passed; 1 failed; 32 ignored; 0 measured; 0 filtered out; finished in 10.87s

error: test failed, to rerun pass `-p nu-command --test main`
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🔴 `toolkit test`
-  `toolkit test stdlib`
```

# After Submitting

nothing
2024-06-26 17:51:47 -05:00
Ian Manske
6fd854ed9f
Replace ExternalStream with new ByteStream type (#12774)
# Description
This PR introduces a `ByteStream` type which is a `Read`-able stream of
bytes. Internally, it has an enum over three different byte stream
sources:
```rust
pub enum ByteStreamSource {
    Read(Box<dyn Read + Send + 'static>),
    File(File),
    Child(ChildProcess),
}
```

This is in comparison to the current `RawStream` type, which is an
`Iterator<Item = Vec<u8>>` and has to allocate for each read chunk.

Currently, `PipelineData::ExternalStream` serves a weird dual role where
it is either external command output or a wrapper around `RawStream`.
`ByteStream` makes this distinction more clear (via `ByteStreamSource`)
and replaces `PipelineData::ExternalStream` in this PR:
```rust
pub enum PipelineData {
    Empty,
    Value(Value, Option<PipelineMetadata>),
    ListStream(ListStream, Option<PipelineMetadata>),
    ByteStream(ByteStream, Option<PipelineMetadata>),
}
```

The PR is relatively large, but a decent amount of it is just repetitive
changes.

This PR fixes #7017, fixes #10763, and fixes #12369.

This PR also improves performance when piping external commands. Nushell
should, in most cases, have competitive pipeline throughput compared to,
e.g., bash.
| Command | Before (MB/s) | After (MB/s) | Bash (MB/s) |
| -------------------------------------------------- | -------------:|
------------:| -----------:|
| `throughput \| rg 'x'` | 3059 | 3744 | 3739 |
| `throughput \| nu --testbin relay o> /dev/null` | 3508 | 8087 | 8136 |

# User-Facing Changes
- This is a breaking change for the plugin communication protocol,
because the `ExternalStreamInfo` was replaced with `ByteStreamInfo`.
Plugins now only have to deal with a single input stream, as opposed to
the previous three streams: stdout, stderr, and exit code.
- The output of `describe` has been changed for external/byte streams.
- Temporary breaking change: `bytes starts-with` no longer works with
byte streams. This is to keep the PR smaller, and `bytes ends-with`
already does not work on byte streams.
- If a process core dumped, then instead of having a `Value::Error` in
the `exit_code` column of the output returned from `complete`, it now is
a `Value::Int` with the negation of the signal number.

# After Submitting
- Update docs and book as necessary
- Release notes (e.g., plugin protocol changes)
- Adapt/convert commands to work with byte streams (high priority is
`str length`, `bytes starts-with`, and maybe `bytes ends-with`).
- Refactor the `tee` code, Devyn has already done some work on this.

---------

Co-authored-by: Devyn Cairns <devyn.cairns@gmail.com>
2024-05-16 07:11:18 -07:00
Stefan Holderbach
406df7f208
Avoid taking unnecessary ownership of intermediates (#12740)
# Description

Judiciously try to avoid allocations/clone by changing the signature of
functions

- **Don't pass str by value unnecessarily if only read**
- **Don't require a vec in `Sandbox::with_files`**
- **Remove unnecessary string clone**
- **Fixup unnecessary borrow**
- **Use `&str` in shape color instead**
- **Vec -> Slice**
- **Elide string clone**
- **Elide `Path` clone**
- **Take &str to elide clone in tests**

# User-Facing Changes
None

# Tests + Formatting
This touches many tests purely in changing from owned to borrowed/static
data
2024-05-04 00:53:15 +00:00
Ian Manske
26786a759e
Fix ignored clippy lints (#12160)
# Description
Fixes some ignored clippy lints.

# User-Facing Changes
Changes some signatures and return types to `&dyn Command` instead of
`&Box<dyn Command`, but I believe this is only an internal change.
2024-03-11 19:46:04 +01:00
Darren Schroeder
e93e51d672
bump rust-toolchain to 1.72.1 (#11079)
# Description

This PR follows our process of staying 2 releases behind rust. 1.74.0
was released today so we update to 1.72.1.

Reference https://releases.rs/

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2023-11-16 15:14:45 -06:00
Hudson Clark
cb754befe9
fix: Ensure consistent vals and cols when parsing with --flexible (#10814)
# Description
`from tsv` and `from csv` both support a `--flexible` flag. This flag
can be used to "allow the number of fields in records to be variable".

Previously, a record's invariant that `rec.cols.len() == rec.vals.len()`
could be broken during parsing. This can cause runtime errors as in
#10693. Other commands, like `select` were also affected.

The inconsistencies are somewhat hard to see, as most nushell code
assumes an equal number of columns and values.

# Before

### Fewer values than columns
```nushell
> let record = (echo "one,two\n1" | from csv --flexible | first)
# There are two columns
> $record | columns | to nuon
[one, two]
# But only one value
> $record | values | to nuon
[1]
# And printing the record doesn't show the second column!
> $record | to nuon
{one: 1}
```

### More values than columns
```nushell
> let record = (echo "one,two\n1,2,3" | from csv --flexible | first)
# There are two columns
> $record | columns | to nuon
[one, two]
# But three values
> $record | values | to nuon
[1, 2, 3]
# And printing the record doesn't show the third value!
> $record | to nuon
{one: 1, two: 2}
```
# After

### Fewer values than columns
```nushell
> let record = (echo "one,two\n1" | from csv --flexible | first)
# There are two columns
> $record | columns | to nuon
[one, two]
# And a matching number of values
> $record | values | to nuon
[1, null]
# And printing the record works as expected
> $record | to nuon
{one: 1, two: null}
```

### More values than columns
```nushell
> let record = (echo "one,two\n1,2,3" | from csv --flexible | first)
# There are two columns
> $record | columns | to nuon
[one, two]
# And a matching number of values
> $record | values | to nuon
[1, 2]
# And printing the record works as expected
> $record | to nuon
{one: 1, two: 2}
```

# User-Facing Changes
Using the `--flexible` flag with `from csv` and `from tsv` will not
result in corrupted record state.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2023-10-24 15:54:26 -05:00
Stefan Holderbach
7d6b23ee2f
Simplify rawstrings in tests (#10180)
Inspired by
https://rust-lang.github.io/rust-clippy/master/index.html#/needless_raw_string_hashes

Ran `cargo +stable clippy --workspace --all-targets`

Fixed manually as I ran into a false positive along the lines of:
https://github.com/rust-lang/rust-clippy/issues/11068

Also collapse one set of single line tests.

Work for #8670
2023-09-01 00:08:27 +02:00
Matthias Q
93f20b406e
feat: allow from csv to accept 4 byte unicode separator chars (#10138)
- this PR should close #10132

# Description
* added a flag to `from csv --ascii` that replaces the given `separator
with the unicode separator x1f https://www.codetable.net/hex/1f (aka
Information Separator One)

# User-Facing Changes
New flags are available for `from csv` ( `--ascii` or short `-a`)

# Tests + Formatting
There are no tests at the moment. Code has been formatted.
- `cargo test --workspace` (breaks with a non related test on my
machine)
2023-08-31 18:55:39 +02:00
Stefan Holderbach
656f707a0b
Clean up tests containing unnecessary cwd: tokens (#9692)
# Description
The working directory doesn't have to be set for those tests (or would
be the default anyways). When appropriate also remove calls to the
`pipeline()` function. In most places kept the diff minimal and only
removed the superfluous part to not pollute the blame view. With simpler
tests also simplified things to make them more readable overall (this
included removal of the raw string literal).

Work for #8670
2023-07-17 18:43:51 +02:00
JT
786ba3bf91
Input output checking (#9680)
# Description

This PR tights input/output type-checking a bit more. There are a lot of
commands that don't have correct input/output types, so part of the
effort is updating them.

This PR now contains updates to commands that had wrong input/output
signatures. It doesn't add examples for these new signatures, but that
can be follow-up work.

# User-Facing Changes

BREAKING CHANGE BREAKING CHANGE

This work enforces many more checks on pipeline type correctness than
previous nushell versions. This strictness may uncover incompatibilities
in existing scripts or shortcomings in the type information for internal
commands.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect -A clippy::result_large_err` to check that
you're using the standard code style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2023-07-14 15:20:35 +12:00
Matthew Deville
8543b0789d
Additional flags for commands from csv and from tsv (#8398)
# Description

Resolves issue #8370

Adds the following flags to commands `from csv` and `from tsv`:
- `--flexible`: allow the number of fields in records to be variable
- `-c --comment`: a comment character to ignore lines starting with it
- `-q --quote`: a quote character to ignore separators in strings,
defaults to '\"'
- `-e --escape`: an escape character for strings containing the quote
character

Internally, the `Value` struct has an additional helper function
`as_char` which converts it to a single `char`

# User-Facing Changes

The single quoted string `'\t'` can no longer be used as a parameter for
the flag `--separator '\t'` as it is interpreted as a two-character
string. One needs to use from now on the flag with a double quoted
string like so: `-s "\t"` which correctly interprets the string as a
single `char`.
2023-03-16 17:49:46 -05:00
JT
61455b457d
Fix warnings and old names (#8457)
# Description

This fixes up some clippy warnings and removes some old names/info from
our unit tests

# User-Facing Changes

Internal changes only

# Tests + Formatting

Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```

# After Submitting

If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
2023-03-15 18:54:55 +13:00
Artemiy
b9419e0f36
To csv fix (#7850)
# Description

Fixes #7800 . 
`to csv` and `to tsv` no longer:
- accept anything but records and tables as input,
- accept lists that are not tables,
- accept tables and records with values that are not primitives (other
lists, tables and records).

# User-Facing Changes

Using `to csv` and `to tsv` on any of inputs mentioned above will result
in `cant_convert` error.

# Tests + Formatting

Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass

# After Submitting

If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.

Co-authored-by: Stefan Holderbach <sholderbach@users.noreply.github.com>
2023-01-25 08:42:53 -06:00
pwygab
32fbcf39cc
make first behave same way as last: always return list when with number argument (#6616)
* make `first` behave same way as `last`

* better behaviour

* fix tests

* add tests
2022-09-28 17:08:17 -05:00
Justin Ma
aea4355d04
refactor: change column names from 'Column*' to 'column*' (#4556) 2022-02-19 19:26:47 -05:00
JT
d70d91e559 Remove old nushell/merge engine-q 2022-02-07 14:54:06 -05:00
Fernando Herrera
fdce6c49ab engine-q merge 2022-02-07 19:11:34 +00:00
JT
a008f1aa80
Command tests (#922)
* WIP command tests

* Finish marking todo tests

* update

* update

* Windows cd test ignoring
2022-02-03 21:01:45 -05:00
John-Goff
c13fe83784
Rename count to length (#3166)
* update docs to refer to length instead of count

* rename count to length

* change all occurrences of 'count' to 'length' in tests

* format length command
2021-03-14 10:46:40 +13:00
Saeed Rasooli
42d18d2294
add "-0" as short for --headerless in "from" commands (#3042)
* replace --headerless flags with --noheaders / -n

* Update from_csv.rs

Co-authored-by: Jonathan Turner <jonathandturner@users.noreply.github.com>
2021-02-22 20:25:17 +13:00
Michael Angerman
d06f457b2a
nu-cli refactor moving commands into their own crate nu-command (#2910)
* move commands, futures.rs, script.rs, utils

* move over maybe_print_errors

* add nu_command crate references to nu_cli

* in commands.rs open up to pub mod from pub(crate)

* nu-cli, nu-command, and nu tests are now passing

* cargo fmt

* clean up nu-cli/src/prelude.rs

* code cleanup

* for some reason lex.rs was not formatted, may be causing my error

* remove mod completion from lib.rs which was not being used along with quickcheck macros

* add in allow unused imports

* comment out one failing external test; comment out one failing internal test

* revert commenting out failing tests; something else might be going on; someone with a windows machine should check and see what is going on with these failing windows tests

* Update Cargo.toml

Extend the optional features to nu-command

Co-authored-by: Jonathan Turner <jonathandturner@users.noreply.github.com>
2021-01-12 17:59:53 +13:00
Renamed from crates/nu-cli/tests/format_conversions/csv.rs (Browse further)