fish-shell/src/parse_execution.h

// Provides the "linkage" between an ast and actual execution structures (job_t, etc.).
#ifndef FISH_PARSE_EXECUTION_H
#define FISH_PARSE_EXECUTION_H
#include <stddef.h>
#include <vector>
#include "ast.h" // IWYU pragma: keep
#include "common.h"
#include "io.h"
#include "maybe.h"
#include "parse_constants.h"
#include "parse_tree.h"
#include "proc.h"
#include "redirection.h"
class block_t;
class operation_context_t;
class parser_t;
/// An end_execution_reason_t describes why evaluation ended. It covers evaluation errors such as
/// wildcards which failed to match, syntax errors, and other expansion errors, and it tracks when
/// evaluation was skipped due to control flow or cancellation. Note it does not track the exit
/// status of commands.
enum class end_execution_reason_t {
/// Evaluation was successful.
ok,
/// Evaluation was skipped due to control flow (break or return).
control_flow,
/// Evaluation was cancelled, e.g. because of a signal or exit.
cancelled,
/// A parse error or failed expansion (but not an error exit status from a command).
error,
};
class parse_execution_context_t : noncopyable_t {
private:
rust::Box<parsed_source_ref_t> pstree;
parser_t *const parser;
const operation_context_t &ctx;
// If set, one of our processes received a cancellation signal (INT or QUIT) so we are
// unwinding.
int cancel_signal{0};
// The currently executing job node, used to indicate the line number.
const ast::job_pipeline_t *executing_job_node{};
// Cached line number information.
size_t cached_lineno_offset = 0;
int cached_lineno_count = 0;
/// The block IO chain.
/// For example, in `begin; foo ; end < file.txt` this would have the 'file.txt' IO.
io_chain_t block_io{};
// Check to see if we should end execution.
// \return the eval result to end with, or none() to continue on.
// This will never return end_execution_reason_t::ok.
maybe_t<end_execution_reason_t> check_end_execution() const;
// Report an error, setting $status to \p status. Always returns
// 'end_execution_reason_t::error'.
end_execution_reason_t report_error(int status, const ast::node_t &node, const wchar_t *fmt,
...) const;
end_execution_reason_t report_errors(int status, const parse_error_list_t &error_list) const;
/// Command not found support.
end_execution_reason_t handle_command_not_found(const wcstring &cmd,
const ast::decorated_statement_t &statement,
int err_code);
// Utilities
wcstring get_source(const ast::node_t &node) const;
const ast::decorated_statement_t *infinite_recursive_statement_in_job_list(
const ast::job_list_t &jobs, wcstring *out_func_name) const;
// Expand a command which may contain variables, producing an expanded command and possibly
// arguments. Prints an error message on error.
end_execution_reason_t expand_command(const ast::decorated_statement_t &statement,
wcstring *out_cmd, std::vector<wcstring> *out_args) const;
/// Indicates whether a job is a simple block (one block, no redirections).
bool job_is_simple_block(const ast::job_pipeline_t &job) const;
enum process_type_t process_type_for_command(const ast::decorated_statement_t &statement,
const wcstring &cmd) const;
end_execution_reason_t apply_variable_assignments(
process_t *proc, const ast::variable_assignment_list_t &variable_assignment_list,
const block_t **block);
// These create process_t structures from statements.
end_execution_reason_t populate_job_process(
job_t *job, process_t *proc, const ast::statement_t &statement,
const ast::variable_assignment_list_t &variable_assignments);
end_execution_reason_t populate_not_process(job_t *job, process_t *proc,
const ast::not_statement_t &not_statement);
end_execution_reason_t populate_plain_process(process_t *proc,
const ast::decorated_statement_t &statement);
template <typename Type>
end_execution_reason_t populate_block_process(process_t *proc,
const ast::statement_t &statement,
const Type &specific_statement);
// These encapsulate the actual logic of various (block) statements.
end_execution_reason_t run_block_statement(const ast::block_statement_t &statement,
const block_t *associated_block);
end_execution_reason_t run_for_statement(const ast::for_header_t &header,
const ast::job_list_t &contents);
end_execution_reason_t run_if_statement(const ast::if_statement_t &statement,
const block_t *associated_block);
end_execution_reason_t run_switch_statement(const ast::switch_statement_t &statement);
end_execution_reason_t run_while_statement(const ast::while_header_t &header,
const ast::job_list_t &contents,
const block_t *associated_block);
end_execution_reason_t run_function_statement(const ast::block_statement_t &statement,
const ast::function_header_t &header);
end_execution_reason_t run_begin_statement(const ast::job_list_t &contents);
enum globspec_t { failglob, nullglob };
using ast_args_list_t = std::vector<const ast::argument_t *>;
static ast_args_list_t get_argument_nodes(const ast::argument_list_t &args);
static ast_args_list_t get_argument_nodes(const ast::argument_or_redirection_list_t &args);
end_execution_reason_t expand_arguments_from_nodes(const ast_args_list_t &argument_nodes,
std::vector<wcstring> *out_arguments,
globspec_t glob_behavior);
// Determines the list of redirections for a node.
end_execution_reason_t determine_redirections(const ast::argument_or_redirection_list_t &list,
redirection_spec_list_t *out_redirections);
end_execution_reason_t run_1_job(const ast::job_pipeline_t &job,
const block_t *associated_block);
end_execution_reason_t test_and_run_1_job_conjunction(const ast::job_conjunction_t &jc,
const block_t *associated_block);
end_execution_reason_t run_job_conjunction(const ast::job_conjunction_t &job_expr,
const block_t *associated_block);
end_execution_reason_t run_job_list(const ast::job_list_t &job_list_node,
const block_t *associated_block);
end_execution_reason_t run_job_list(const ast::andor_job_list_t &job_list_node,
const block_t *associated_block);
end_execution_reason_t populate_job_from_job_node(job_t *j, const ast::job_pipeline_t &job_node,
const block_t *associated_block);
// Assign a job group to the given job.
void setup_group(job_t *j);
// \return whether we should apply job control to our processes.
bool use_job_control() const;
// Returns the line number of the node. Not const since it touches cached_lineno_offset.
int line_offset_of_node(const ast::job_pipeline_t *node);
int line_offset_of_character_at_offset(size_t offset);
public:
/// Construct a context in preparation for evaluating a node in a tree, with the given block_io.
/// The execution context may access the parser and parent job group (if any) through ctx.
parse_execution_context_t(rust::Box<parsed_source_ref_t> pstree, const operation_context_t &ctx,
io_chain_t block_io);
/// Returns the current line number, indexed from 1. Not const since it touches
/// cached_lineno_offset.
int get_current_line_number();
/// Returns the source offset, or -1.
int get_current_source_offset() const;
/// Returns the source string.
const wcstring &get_source() const { return pstree->src(); }
/// Return the parsed ast.
const ast::ast_t &ast() const { return pstree->ast(); }
/// Start executing at the given node. Returns end_execution_reason_t::ok if there was no error;
/// otherwise returns the reason execution ended.
end_execution_reason_t eval_node(const ast::statement_t &statement,
const block_t *associated_block);
end_execution_reason_t eval_node(const ast::job_list_t &job_list,
const block_t *associated_block);
};
#endif
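
// Illustrative sketch, not part of this header: one way the documented contract of
// check_end_execution() ("returns the reason to end with, or none() to continue; never ok")
// could be used by a loop that runs a list of jobs. Everything named sketch_* is hypothetical,
// std::optional stands in for maybe_t, and the enum is copied locally so the sketch compiles on
// its own. The real run_job_list() may be structured differently.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Local copy of the enum declared in this header, so the sketch is self-contained.
enum class end_execution_reason_t { ok, control_flow, cancelled, error };

// Hypothetical stand-in for the private state of parse_execution_context_t.
struct sketch_context_t {
    int cancel_signal{0};            // non-zero once a cancellation signal was seen
    bool control_flow_pending{false};  // hypothetical flag for a pending break/return
    int jobs_run{0};                 // hypothetical counter, only for demonstration

    // Mirrors the documented contract: returns a reason to stop, or nothing to continue.
    // It never returns end_execution_reason_t::ok.
    std::optional<end_execution_reason_t> check_end_execution() const {
        if (cancel_signal != 0) return end_execution_reason_t::cancelled;
        if (control_flow_pending) return end_execution_reason_t::control_flow;
        return std::nullopt;
    }
};

// A hypothetical job: a plain function that runs against the context.
using sketch_job_t = end_execution_reason_t (*)(sketch_context_t &);

// Run jobs in order, checking before each one whether execution should end.
// Assumption of this sketch: a job that reports a non-ok reason also stops the loop.
inline end_execution_reason_t sketch_run_job_list(sketch_context_t &ctx,
                                                  const std::vector<sketch_job_t> &jobs) {
    end_execution_reason_t result = end_execution_reason_t::ok;
    for (sketch_job_t job : jobs) {
        if (auto reason = ctx.check_end_execution()) {
            assert(*reason != end_execution_reason_t::ok);
            return *reason;
        }
        ctx.jobs_run++;
        result = job(ctx);
        if (result != end_execution_reason_t::ok) break;
    }
    return result;
}
```

// In this sketch a job that records a cancellation signal causes every later job to be skipped,
// and the caller sees end_execution_reason_t::cancelled instead of the last job's result.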