nushell/src/test_bins.rs


pipe binary data to external commands (#8058)

Fixes #7615

# Description

When calling external commands, we create a table from the pipeline data to handle external commands expecting paginated input. When a binary value is made into a table, we convert the vector of bytes representing the binary value into a pretty formatted string. This results in the pretty formatted string being sent to external commands instead of the actual binary bytes. By checking whether the stdout of the call is being redirected, we can decide whether to send the raw binary bytes or the pretty formatted output when creating a table command.

# User-Facing Changes

When passing binary values to external commands, the external command will receive the actual bytes instead of the pretty printed string. Use cases that don't involve piping a binary value into an external command are unchanged.

![new_behavior](https://user-images.githubusercontent.com/32406734/218349172-24cd12f2-d563-4957-bdf1-6aa804b174b2.png)
2023-02-24 20:39:52 +00:00
use std::io::{self, BufRead, Read, Write};
use nu_command::create_default_context;
use nu_command::hook::{eval_env_change_hook, eval_hook};
use nu_engine::eval_block;
use nu_parser::parse;
use nu_protocol::engine::{EngineState, Stack, StateWorkingSet};
use nu_protocol::{CliError, PipelineData, Value};
// use nu_test_support::fs::in_directory;

/// Echoes the value of each env key given in args.
/// Writes to stdout when `to_stdout` is true, otherwise to stderr.
/// Example: nu --testbin echo_env FOO BAR
/// If a key is not set, echoes nothing for it.
pub fn echo_env(to_stdout: bool) {
    let args = args();
    for arg in args {
        if let Ok(v) = std::env::var(arg) {
            if to_stdout {
                println!("{v}");
            } else {
                eprintln!("{v}");
            }
        }
    }
}

pub fn cococo() {
    let args: Vec<String> = args();

    if args.len() > 1 {
        // Write back out all the arguments passed
        // if given at least 1 instead of chickens
        // speaking co co co.
        println!("{}", &args[1..].join(" "));
    } else {
        println!("cococo");
    }
}

pub fn meow() {
    let args: Vec<String> = args();

    for arg in args.iter().skip(1) {
        let contents = std::fs::read_to_string(arg).expect("Expected a filepath");
        println!("{contents}");
    }
}

// A binary version of meow
pub fn meowb() {
    let args: Vec<String> = args();

    let stdout = io::stdout();
    let mut handle = stdout.lock();

    for arg in args.iter().skip(1) {
        let buf = std::fs::read(arg).expect("Expected a filepath");
        handle.write_all(&buf).expect("failed to write to stdout");
    }
}

// Relays anything received on stdin to stdout
pub fn relay() {
    io::copy(&mut io::stdin().lock(), &mut io::stdout().lock())
        .expect("failed to copy stdin to stdout");
}

pub fn nonu() {
    args().iter().skip(1).for_each(|arg| print!("{arg}"));
}

pub fn repeater() {
    let mut stdout = io::stdout();
    let args = args();
    let mut args = args.iter().skip(1);
    let letter = args.next().expect("needs a character to iterate");
    let count = args.next().expect("need the number of times to iterate");

    let count: u64 = count.parse().expect("can't convert count to number");

    for _ in 0..count {
        let _ = write!(stdout, "{letter}");
    }
    let _ = stdout.flush();
}
special-case ExternalStream in bytes starts-with (#8203)

# Description

`bytes starts-with` converts the input into a `Value` before running .starts_with to find if the binary matches. This has two side effects. It makes the code simpler, since it only deals in whole values, which avoids a lot of the input pipeline handling and value transforming it would otherwise have to do, _especially_ in the presence of a cell path to drill into. It also buffers the entire input into memory, which can take up a lot of memory when dealing with large files, especially if you only want to check the first few bytes (like for a magic number).

This PR adds a special branch on PipelineData::ExternalStream with a streaming version of starts_with.

# User-Facing Changes

Opening large files and running bytes starts-with on them will not take a long time.

# Drawbacks

Streaming checking is more complicated, and there may be bugs. I tested it with multiple chunks with string data and binary data and it seems to work alright up to 8k and over bytes, though.

The existing `operate` method still exists because the way it handles cell paths and values is complicated. This causes some "code duplication", or at least some intent duplication, between the value code and the streaming code. This might be worthwhile considering the performance gains (approaching infinity on larger inputs).

Another thing to consider is that my ExternalStream branch considers string data as valid input. The operate branch only parses Binary values, so it would fail. `open` is kind of unpredictable on whether it returns string data or binary data, even when passing `--raw`. I think this can be a problem but not really one I'm trying to tackle in this PR, so, it's worth considering.
2023-02-26 14:17:44 +00:00

// A version of repeater that can output binary data, even null bytes
pub fn repeat_bytes() {
    let mut stdout = io::stdout();
    let args = args();
    let mut args = args.iter().skip(1);

    while let (Some(binary), Some(count)) = (args.next(), args.next()) {
        let bytes: Vec<u8> = (0..binary.len())
            .step_by(2)
            .map(|i| {
                u8::from_str_radix(&binary[i..i + 2], 16)
                    .expect("binary string is valid hexadecimal")
            })
            .collect();
        let count: u64 = count.parse().expect("repeat count must be a number");

        for _ in 0..count {
            stdout
                .write_all(&bytes)
                .expect("writing to stdout must not fail");
        }
    }

    let _ = stdout.flush();
}
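
The hex decoding inside `repeat_bytes` slices the argument two bytes at a time, so an odd-length string makes `&binary[i..i + 2]` run past the end and panic before the `expect` message is ever reached. For a test bin that is probably acceptable. As a sketch only, a hypothetical helper named `decode_hex` (not part of the original file) shows how the same pairwise decoding could report such input as an error instead:

```rust
/// Hypothetical helper, not in the original file: decodes a string of
/// two-digit hexadecimal pairs (e.g. "00ff") into bytes.
fn decode_hex(hex: &str) -> Result<Vec<u8>, String> {
    // Rejecting non-ASCII input keeps the byte-index slicing below on
    // character boundaries.
    if !hex.is_ascii() {
        return Err(format!("not ASCII: {hex:?}"));
    }
    // Every byte needs exactly two hex digits.
    if hex.len() % 2 != 0 {
        return Err(format!("odd number of hex digits: {hex:?}"));
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| {
            u8::from_str_radix(&hex[i..i + 2], 16)
                .map_err(|e| format!("invalid hex pair {:?}: {e}", &hex[i..i + 2]))
        })
        .collect()
}
```

Collecting an iterator of `Result` values into `Result<Vec<u8>, String>` stops at the first failing pair, so the caller gets either all the bytes or one error.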

pub fn iecho() {
    // println! panics if stdout gets closed, whereas writeln gives us an error
    let mut stdout = io::stdout();
    let _ = args()
        .iter()
        .skip(1)
        .cycle()
        .try_for_each(|v| writeln!(stdout, "{v}"));
}

pub fn fail() {
    std::process::exit(1);
}

pub fn chop() {
    if did_chop_arguments() {
        // we are done and don't care about standard input.
        std::process::exit(0);
    }

    // if no arguments given, chop from standard input and exit.
    let stdin = io::stdin();
    let mut stdout = io::stdout();

    // Stop at the first read error instead of skipping it; flattening the
    // results could loop forever on a reader that keeps returning errors.
    for given in stdin.lock().lines().map_while(Result::ok) {
        let chopped = if given.is_empty() {
            &given
        } else {
            let to = given.len() - 1;
            &given[..to]
        };

        if let Err(_e) = writeln!(stdout, "{chopped}") {
            break;
        }
    }

    std::process::exit(0);
}
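
`chop` removes the last byte of each line by slicing at `given.len() - 1`. That is fine for ASCII input, but slicing a `&str` at an index that is not a character boundary panics, which would happen if a line ended in a multi-byte character. A minimal sketch of a character-aware variant, using a hypothetical helper named `chop_last_char` (not part of the original file):

```rust
/// Hypothetical helper, not in the original file: returns `s` without its
/// last character, or `s` unchanged when it is empty.
fn chop_last_char(s: &str) -> &str {
    // `char_indices` yields the byte offset at which each character starts,
    // so the offset of the last character is always a valid slice boundary.
    match s.char_indices().next_back() {
        Some((idx, _)) => &s[..idx],
        None => s,
    }
}
```

Whether the test bin should chop a byte or a character is a design choice for the tests that use it; this sketch only illustrates the character-based option.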

fn outcome_err(
    engine_state: &EngineState,
    error: &(dyn miette::Diagnostic + Send + Sync + 'static),
) -> ! {
    let working_set = StateWorkingSet::new(engine_state);

    eprintln!("Error: {:?}", CliError(error, &working_set));

    std::process::exit(1);
}

fn outcome_ok(msg: String) -> ! {
    println!("{msg}");

    std::process::exit(0);
}

pub fn nu_repl() {
    //cwd: &str, source_lines: &[&str]) {
    let cwd = std::env::current_dir().expect("Could not get current working directory.");
    let source_lines = args();

    let mut engine_state = create_default_context();
    let mut stack = Stack::new();

    stack.add_env_var("PWD".to_string(), Value::test_string(cwd.to_string_lossy()));

    let mut last_output = String::new();

    for (i, line) in source_lines.iter().enumerate() {
        let cwd = nu_engine::env::current_dir(&engine_state, &stack)
            .unwrap_or_else(|err| outcome_err(&engine_state, &err));

        // Before doing anything, merge the environment from the previous REPL iteration into the
        // permanent state.
        if let Err(err) = engine_state.merge_env(&mut stack, &cwd) {
            outcome_err(&engine_state, &err);
        }

        // Check for pre_prompt hook
        let config = engine_state.get_config();
        if let Some(hook) = config.hooks.pre_prompt.clone() {
            if let Err(err) = eval_hook(&mut engine_state, &mut stack, None, vec![], &hook) {
                outcome_err(&engine_state, &err);
            }
        }

        // Check for env change hook
        let config = engine_state.get_config();
        if let Err(err) = eval_env_change_hook(
            config.hooks.env_change.clone(),
            &mut engine_state,
            &mut stack,
        ) {
            outcome_err(&engine_state, &err);
        }

        // Check for pre_execution hook
        let config = engine_state.get_config();

        *engine_state
            .repl_buffer_state
            .lock()
            .expect("repl buffer state mutex") = line.to_string();

        if let Some(hook) = config.hooks.pre_execution.clone() {
            if let Err(err) = eval_hook(&mut engine_state, &mut stack, None, vec![], &hook) {
                outcome_err(&engine_state, &err);
            }
        }

        // Eval the REPL line
        let (block, delta) = {
            let mut working_set = StateWorkingSet::new(&engine_state);
            let block = parse(
                &mut working_set,
                Some(&format!("line{i}")),
                line.as_bytes(),
                false,
            );

            if let Some(err) = working_set.parse_errors.first() {
                outcome_err(&engine_state, err);
            }
            (block, working_set.render())
        };

        if let Err(err) = engine_state.merge_delta(delta) {
            outcome_err(&engine_state, &err);
        }

        let input = PipelineData::empty();
        let config = engine_state.get_config();

        match eval_block(&engine_state, &mut stack, &block, input, false, false) {
            Ok(pipeline_data) => match pipeline_data.collect_string("", config) {
                Ok(s) => last_output = s,
                Err(err) => outcome_err(&engine_state, &err),
            },
            Err(err) => outcome_err(&engine_state, &err),
        }

        if let Some(cwd) = stack.get_env_var(&engine_state, "PWD") {
            let path = cwd
                .as_string()
                .unwrap_or_else(|err| outcome_err(&engine_state, &err));
            let _ = std::env::set_current_dir(path);
            engine_state.add_env_var("PWD".into(), cwd);
        }
    }

    outcome_ok(last_output)
}

fn did_chop_arguments() -> bool {
    let args: Vec<String> = args();

    if args.len() > 1 {
        let mut arguments = args.iter();
        arguments.next();

        for arg in arguments {
            let chopped = if arg.is_empty() {
                arg
            } else {
                let to = arg.len() - 1;
                &arg[..to]
            };

            println!("{chopped}");
        }

        return true;
    }

    false
}

pub fn input_bytes_length() {
    let stdin = io::stdin();
    let count = stdin.lock().bytes().count();

    println!("{count}");
}

fn args() -> Vec<String> {
    // Skip the program name and `--testbin`. The returned vector starts with
    // the test bin name, followed by its arguments.
    std::env::args().skip(2).collect()
}
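
Because `args()` drops only the first two entries of the process arguments, element 0 of its result is the test bin name itself, which is why the functions above use `skip(1)` or `args[1..]` before reading their own arguments. A small sketch, assuming a hypothetical stand-in named `testbin_args` that takes the argument list explicitly (the original reads `std::env::args()` directly):

```rust
/// Hypothetical stand-in for `args()`, not in the original file: takes the
/// process arguments explicitly so the skipping can be checked in isolation.
fn testbin_args<I>(argv: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    // Drop the program name and the `--testbin` flag; keep the rest.
    argv.into_iter().skip(2).collect()
}
```

For an invocation like `nu --testbin cococo a b`, this leaves `cococo`, `a`, `b`, so a test bin finds its first real argument at index 1.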