fixes#969, admittedly without a --delimiter alias
moves from_structured_data.rs to from_delimited_data.rs to better
identify its scope and adds to_delimited_data.rs. Now csv and tsv both
use the same code, tsv passes in a fixed '\t' argument where csv passes
in the value of --separator
`save` attempts to convert input based on the target filename extension,
and expects a stream of text otherwise. However the error message is
unclear and provides little guidance, hopefully this is less confusing
to new users.
It might be worthwhile to also add a hint about adding an extension,
though I'm not sure if it's possible to emit multiple diagnostics.
tables and able to work with them for data processing & viewing
purposes. At the moment, certain ways to process said tables we
are able to view a histogram of a given column.
As usage matures, we may find certain core commands that could
be used ergonomically when working with tables on Nu.
The original purpose of this PR was to modernize the external parser to
use the new Shape system.
This commit does include some of that change, but a more important
aspect of this change is an improvement to the expansion trace.
Previous commit 6a7c00ea adding trace infrastructure to the syntax coloring
feature. This commit adds tracing to the expander.
The bulk of that work, in addition to the tree builder logic, was an
overhaul of the formatter traits to make them more general purpose, and
more structured.
Some highlights:
- `ToDebug` was split into two traits (`ToDebug` and `DebugFormat`)
because implementations needed to become objects, but a convenience
method on `ToDebug` didn't qualify
- `DebugFormat`'s `fmt_debug` method now takes a `DebugFormatter` rather
than a standard formatter, and `DebugFormatter` has a new (but still
limited) facility for structured formatting.
- Implementations of `ExpandSyntax` need to produce output that
implements `DebugFormat`.
Unlike the highlighter changes, these changes are fairly focused in the
trace output, so these changes aren't behind a flag.
a joy. Fundamentally we embrace functional programming principles for
transforming the dataset from any format picked up by Nu. This table
processing "primitive" commands will build up and make pipelines
composable with data processing capabilities allowing us the valuate,
reduce, and map, the tables as far as even composing this declartively.
On this regard, `split-by` expects some table with grouped data and we
can use it further in interesting ways (Eg. collecting labels for
visualizing the data in charts and/or suit it for a particular chart
of our interest).
Previously it would split the last column on the first separator value found
between the start of the column and the end of the row. Changing this to using
everything from the start of the column to the end of the string makes it behave
more similarly to the other columns, making it less surprising.
The table parsing/creation logic has changed from treating every line the same
to processing each line in context of the column header's placement. Previously,
lines on separate rows would go towards the same column as long as they were the
same index based on separator alone. Now, each item's index is based on vertical
alignment to the column header.
This may seem brittle, but it solves the problem of some tables operating with
empty cells that would cause remaining values to be paired with the wrong
column.
Based on kubernetes output (get pods, events), the new method has shown to have
much greater success rates for parsing.
New tests are added to test for additional cases that might be trickier to
handle with the new logic.
Old tests are updated where their expectations are no longer expected to hold true.
For instance: previously, lines would be treated separately, allowing any index
offset between columns on different rows, as long as they had the same row index
as decided by a separator. When this is no longer the case, some things need to
be adjusted.