In preparation for using wait handles in --on-process-exit events, factor
wait handles into their own wait handle store. Also switch them to
per-process instead of per-job, which is a simplification.
This correctly sets $status when a builtin succeeds but its output fails;
for example if the output is redirected to a file and that write fails.
Fixes #7857
It's not entirely clear what $SHLVL is useful for, but the current
definition is essentially
"the number of shells among the parent processes, plus one"
which isn't particularly useful.
Bash's behavior here is a bit weird in that it increments $SHLVL
basically always, but since it auto-execs the last process it will
decrement it again, so in practice it's often not incremented.
E.g.
```
> echo $SHLVL
1
> bash -c 'echo $SHLVL; bash'
2
>> echo $SHLVL
2
```
Both bashes here end up having the same $SHLVL because this is
equivalent to `echo $SHLVL; exec bash`. Running `echo $SHLVL` and then
`bash -c 'echo $SHLVL'` in an interactive bash will have a different
result (1 and 2) because that doesn't *exec* the inner bash.
That's not something we want to get into, so what we do is increment
$SHLVL in every interactive fish. Non-interactive fish will simply
import the existing value.
That means if you had e.g. a bash that runs a fish script that ends up
opening a new fish session, you would have a $SHLVL of *2* - one for the
bash, and one for the inner fish.
We key this off is_interactive_session() (which can also be enabled
via `fish -i`) because it's easy and because `fish -i` is asking for
fish to be, in some form, "interactive".
That means most of the time $SHLVL will be "how many shells am I deep,
how often do I have to `exit`", except for when you specifically asked
for a fish to be "interactive". If that's a problem, we can rethink it.
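As a rough sketch (illustrative C++; fish's actual variable handling and
helper names differ), the policy amounts to:

```
#include <cstdlib>
#include <string>

// Sketch of the $SHLVL policy above; names are illustrative.
void setup_shlvl(bool is_interactive_session) {
    long shlvl = 0;
    if (const char *inherited = std::getenv("SHLVL")) {
        // Import whatever the parent exported; ignore garbage values.
        shlvl = std::strtol(inherited, nullptr, 10);
        if (shlvl < 0) shlvl = 0;
    }
    // Only interactive sessions (including `fish -i`) add a level.
    if (is_interactive_session) shlvl += 1;
    setenv("SHLVL", std::to_string(shlvl).c_str(), 1);
}
```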
Fixes #7864.
In preparation for concurrent execution, introduce a
`get_performer_for_builtin` function. This function itself returns a
function, which when called will run the builtin. The idea is that the
function may be called on a background thread (but not in this commit).
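In sketch form (simplified C++; the real signature takes more context, and
these type names are stand-ins for fish's internals):

```
#include <functional>

// Illustrative stand-ins for fish's internal types.
struct process_t { /* argv, io, ... */ };
struct proc_status_t { int status; };

// A "performer" packages up a builtin invocation so it can be invoked
// later, potentially on a background thread (not yet in this commit).
using proc_performer_t = std::function<proc_status_t()>;

proc_performer_t get_performer_for_builtin(process_t *p) {
    // Capture the needed context now...
    return [p]() -> proc_status_t {
        // ...and actually run the builtin when called.
        (void)p;
        return proc_status_t{0};
    };
}
```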
Several functions, including wgetopt and execve, operate on null-terminated
arrays of pointers to nul-terminated strings: a list of pointers to C strings
where the last pointer is null. Prior to this change, each process_t stored its
argv in such an array. This had two problems:
1. It was awkward to work with this type, instead of using std::vector,
etc.
2. The process's arguments would be rearranged by builtins, which is
surprising.
Our null terminated arrays were built around a fancy type that would copy
input strings and also generate an array of pointers to them, in one big
allocation.
Switch to a new model where we construct an array of pointers over
existing strings. So you can supply a `vector<string>` and now
`null_terminated_array_t` will just make a list of pointers to them. Now
processes can just store their argv in a familiar wcstring_list_t.
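A sketch of the new shape (narrow-char C++ for brevity; the real type works
with fish's wide strings):

```
#include <string>
#include <vector>

// Non-owning view: an array of C-string pointers over strings that
// must outlive it, with a trailing null pointer for execve-style APIs.
class null_terminated_array_t {
    std::vector<const char *> pointers_;
public:
    explicit null_terminated_array_t(const std::vector<std::string> &strs) {
        pointers_.reserve(strs.size() + 1);
        for (const auto &s : strs) pointers_.push_back(s.c_str());
        pointers_.push_back(nullptr);
    }
    const char *const *get() const { return pointers_.data(); }
};
```

The pointer array is constructed only at the point of use, e.g. just before
calling execve, so the process itself never has to store one.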
This cleans up some exit code processing. Previously a failed exec
would produce exit code 125 unconditionally, while a failed posix_spawn
would produce exit code 1 (!).
With this change, fish reports exit code 126 for not-executable, and 127
for file-not-found. This matches bash.
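The mapping is roughly as follows (a sketch; the surrounding error reporting
is elided):

```
#include <cerrno>

// POSIX shell convention: 127 = file not found, 126 = found but not
// executable. Sketch of translating a failed exec's errno.
int exit_code_for_exec_failure(int err) {
    switch (err) {
        case ENOENT:
        case ENOTDIR:
            return 127;  // no such file or directory
        default:
            return 126;  // exists but not executable (EACCES, ENOEXEC, ...)
    }
}
```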
This change modifies the fish safety check surrounding execve / spawn so
it can run shell scripts having concatenated binary content. We're using
the same safety check as FreeBSD /bin/sh [1] and the Z-shell [5]. POSIX
was recently revised to require this behavior:
"The input file may be of any type, but the initial portion of the
file intended to be parsed according to the shell grammar (XREF to
XSH 2.10.2 Shell Grammar Rules) shall consist of characters and
shall not contain the NUL character. The shell shall not enforce
any line length limits."
"Earlier versions of this standard required that input files to the
shell be text files except that line lengths were unlimited.
However, that was overly restrictive in relation to the fact that
shells can parse a script without a trailing newline, and in
relation to a common practice of concatenating a shell script
ending with an 'exit' or 'exec $command' with a binary data payload
to form a single-file self-extracting archive." [2] [3]
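In sketch form, the relaxed check only rejects a file if the portion the
shell will actually parse contains a NUL byte (the window size and details
here are illustrative, not the exact heuristic):

```
#include <cstddef>

// Accept scripts with binary payloads: only scan the initial portion
// (the part parsed as shell grammar) for NUL bytes.
bool looks_like_shell_script(const char *buf, size_t len) {
    const size_t window = len < 256 ? len : 256;  // arbitrary for this sketch
    for (size_t i = 0; i < window; i++) {
        if (buf[i] == '\0') return false;
    }
    return true;
}
```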
One example use case for such scripts is the Cosmopolitan C Library [4],
which configures the GNU Linker to output a polyglot shell+binary format
that runs on Linux / Mac / Windows / FreeBSD / OpenBSD / NetBSD / BIOS.
Fixes jart/cosmopolitan#88
[1] 9a1cd36331
[2] http://austingroupbugs.net/view.php?id=1250
[3] http://austingroupbugs.net/view.php?id=1226#c4394
[4] https://justine.lol/cosmopolitan/index.html
[5] 326d9c203b
dynamic_cast requires RTTI to be enabled. Now, this isn't a big
problem, but since this is our only dynamic_cast in the entire
codebase, and it's not serving an important function, we can just
replace it.
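The usual RTTI-free replacement (illustrative of the pattern, not necessarily
the exact shape used here) is a virtual downcast hook:

```
// Instead of dynamic_cast<special_t *>(obj), give the base class a
// virtual accessor that derived classes override.
struct base_t {
    virtual ~base_t() = default;
    virtual struct special_t *as_special() { return nullptr; }
};

struct special_t : base_t {
    special_t *as_special() override { return this; }
};
```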
See #7764
Prior to this fix, if stdin were explicitly closed, then builtins would
silently fail. For example:
count <&-
would just fail with status 1. Remove this limitation and allow each
builtin to handle a closed stdin how it sees fit.
Prior to this change, if you pipe a builtin to another process, it would
be buffered. With this fix the builtin will write directly to the pipe if
safe (that is, if the other end of the pipe is owned by some external
process that has been launched).
Most builtins do not produce a lot of output so this is somewhat tricky to
reproduce, but it can be done like so:
bash -c 'for i in {1..500}; do echo $i ; sleep .5; done' |
string match --regex '[02468]' |
cat
Here 'string match' is filtering out numbers which contain no even digits.
With this change, the numbers are printed as they come, instead of
buffering all the output.
Note that bcfc54fdaa fixed this for the case where the
builtin outputs to stdout directly. This fix extends it to all pipelines
that include only one fish internal process.
fish isn't quite sure what to do if the user specifies an fd redirection
for builtins. For example `source <&5` could potentially just read from
an arbitrary file descriptor internal to fish, like the history file.
fish has some lame code that tries to detect these, but it got the sense
wrong. Fix it so that fd redirections for builtins are restricted to the
range 0 through 2.
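In sketch form, the corrected test is simply:

```
// For builtins, an fd redirection may only reference a standard fd;
// anything higher could alias an fd internal to fish (like the history
// file). The old code had this test inverted.
bool fd_redirection_ok_for_builtin(int fd) {
    return fd >= 0 && fd <= 2;
}
```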
This concerns how fish prevents its own fds from interfering with
user-defined fd redirections, like `echo hi >&5`. fish has historically
done this by tracking all user defined redirections when running a job,
and ensuring that pipes are not assigned the same fds. However this is
annoying to pass around - it means that we have to thread user-defined
redirections into pipe creation.
Take a page from zsh and just ensure that all pipes we create have fds in
the "high range," which here means at least 10. The primary way to do this
is via fcntl's F_DUPFD_CLOEXEC operation, which also sets CLOEXEC, so we aren't
invoking additional syscalls in the common case. This will free us from
having to track which fds are in user-defined redirections.
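A sketch of the pipe construction (error handling simplified):

```
#include <fcntl.h>
#include <unistd.h>

// Create a pipe whose fds are at least 10, so they cannot collide with
// user redirections like `>&5`. F_DUPFD_CLOEXEC duplicates to the
// lowest free fd >= 10 and sets CLOEXEC in the same call.
bool make_high_pipe(int fds[2]) {
    if (pipe(fds) < 0) return false;
    for (int i = 0; i < 2; i++) {
        if (fds[i] < 10) {
            int high = fcntl(fds[i], F_DUPFD_CLOEXEC, 10);
            if (high < 0) return false;  // cleanup elided in this sketch
            close(fds[i]);
            fds[i] = high;
        }
    }
    return true;
}
```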
Previously we sometimes wanted to access an io_buffer_t to append to it
directly, but that's no longer true; all we really care about is its
separated_buffer_t. Make io_bufferfill_t::finish return the
separated_buffer directly, simplifying call sites. No user-visible changes
expected here.
This concerns builtins writing to an io_buffer_t. io_buffer_t is how fish
captures output, especially in command substitutions:
set STUFF (string upper stuff)
Recall that io_buffer_t fills itself by reading from an fd (typically
connected to stdout of the command). However if our command is a builtin,
then we can write to the buffer directly.
Prior to this change, when a builtin anticipated writing to an
io_buffer_t, it would first write into an internal buffer, and then after
the builtin was finished, we would copy it to the io_buffer_t. This was
because we didn't have a polymorphic receiver for builtin output: we
always buffered it and then directed it to the io_buffer_t or file
descriptor or stdout or whatever.
Now that we have polymorphic io_streams_t, we can notice ahead of time
that the builtin output is destined for an internal buffer and have it
just write directly to that buffer. This saves a buffering step, which is
a nice simplification.
Prior to this change, the functions in exec.cpp would return true or false
and it was not clear what significance that value had.
Switch to an enum to make this more explicit. In particular we have the
idea of a "pipeline breaking" error which should us to skip processes
which have not yet launched; if no process launches then we can bail out
to a different path which avoids reaping processes.
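Roughly (the names here are illustrative of the idea, not verbatim):

```
// Replaces the old bool return from the launch functions in exec.cpp.
enum class launch_result_t {
    ok,      // the process was launched
    failed,  // launch failed: treat as pipeline-breaking, skip any
             // processes that have not yet launched
};
```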
Fix an error caused by `exec_job()` assuming a job launched with the
intention of being backgrounded would have a pgid assigned in all cases,
without considering the status of `exec_error` which could have resulted
in the job failing before it was launched into its own process group.
Fixes (but doesn't close) #7423 - that can be closed if this assertion
failure doesn't happen in any released fish versions.
The 'time' prefix may come about either because the job itself is marked
with time, or because of the "inside out" weirdness of 'not time...'.
Factor this logic together and precompute it for a job.
In principle this would allow 'string split' or whatever to output to
stderr and not lose the item separation. In practice this is not used
but it fixes a TODO.
builtins output to stdout and stderr via io_streams_t. Prior to this fix, it
contained an output_stream_t which just wrapped a buffer. So all builtin output
went to this buffer (except for eval).
Switch output_stream_t to become a new abstract class which can output to a
buffer, file descriptor, or nowhere. This allows for example `string` to stream
its output as it is produced, instead of buffering it.
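A sketch of the resulting hierarchy (narrow strings for brevity; fish's
streams are wide):

```
#include <string>
#include <unistd.h>

class output_stream_t {
public:
    virtual ~output_stream_t() = default;
    virtual void append(const std::string &s) = 0;
};

// Accumulates output in a buffer, as before.
class buffered_output_stream_t final : public output_stream_t {
    std::string buffer_;
public:
    void append(const std::string &s) override { buffer_ += s; }
    const std::string &contents() const { return buffer_; }
};

// Streams output to an fd as it is produced.
class fd_output_stream_t final : public output_stream_t {
    int fd_;
public:
    explicit fd_output_stream_t(int fd) : fd_(fd) {}
    void append(const std::string &s) override {
        (void)!write(fd_, s.data(), s.size());  // error handling elided
    }
};

// Discards output entirely.
class null_output_stream_t final : public output_stream_t {
public:
    void append(const std::string &) override {}
};
```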
This moves us slightly closer towards fish code in the background. The idea is
that a background job may still have "foreground" sub-jobs, for example:
begin ; sleep 5 ; end &
The begin/end job runs in the background but should wait for `sleep`.
Prior to this fix, fish would see that the overall job group is in the background
and not wait for any of its processes. With this change we detach waiting from
is_foreground.
This changes how fish attempts to protect itself from calling tcsetpgrp() too
aggressively. Recall that tcsetpgrp() will "force" itself if SIGTTOU is
ignored (which it is in fish when job control is enabled).
Prior to this fix, we avoided SIGTTINs by only transferring the tty ownership
if fish was already the owner. This dated from a time before we had really
nailed down how pgroups should be assigned. Now we more deliberately assign a
job's pgroup so we don't need this conservative check.
However we still need logic to avoid transferring the tty if fish is not the
owner. The bad case is when job control is enabled while fish is running in the
background - here fish would transfer the tty and "steal" from the foreground
process.
So retain the checks of the current tty owner but migrate them to the point of
calling tcsetpgrp() itself.
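In sketch form, the guard at the call site looks like:

```
#include <unistd.h>

// Only transfer the tty if fish currently owns it; tcsetpgrp() would
// otherwise force the change (SIGTTOU is ignored under job control)
// and steal the terminal from the real foreground process.
bool maybe_transfer_tty(int tty_fd, pid_t target_pgid) {
    if (tcgetpgrp(tty_fd) != getpgrp()) return false;  // not the owner
    return tcsetpgrp(tty_fd, target_pgid) == 0;
}
```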
Assigning the tty is really a function of a job group, not an individual
job. Reflect that in terminal_maybe_give_to_job_group and also
terminal_return_from_job_group.
Prior to this change, the posix_spawn code paths used a fair amount of
manual management around its allocated structures (attrs and file actions).
Encapsulate this into a new class that handles both the memory management and
the error handling.
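A sketch of the wrapper (names and the exact API are illustrative):

```
#include <spawn.h>

// Owns the attrs and file actions, records the first error, and frees
// everything in the destructor. Error handling is simplified here.
class posix_spawner_t {
    posix_spawnattr_t attr_;
    posix_spawn_file_actions_t actions_;
    int error_;

public:
    posix_spawner_t() {
        error_ = posix_spawnattr_init(&attr_);
        if (error_ == 0) error_ = posix_spawn_file_actions_init(&actions_);
    }
    ~posix_spawner_t() {
        posix_spawn_file_actions_destroy(&actions_);
        posix_spawnattr_destroy(&attr_);
    }
    posix_spawner_t(const posix_spawner_t &) = delete;
    posix_spawner_t &operator=(const posix_spawner_t &) = delete;

    int error() const { return error_; }

    // Setters become no-ops once an error has been recorded.
    void add_dup2(int src, int target) {
        if (error_ == 0)
            error_ = posix_spawn_file_actions_adddup2(&actions_, src, target);
    }
};
```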
Initially I wanted to pick a different name to avoid confusion with
process groups, but really job trees *are* process groups. So name them
to reflect that fact.
Also rename "placeholder" to "internal" which is clearer.
Prior to this, jobs all had a pgid, and fish had to work hard to ensure
that pgids were inherited properly for nested jobs. But now the job tree
is the source of truth and there is only one location for the pgid.
job_lineage was used to track "where jobs came from" but the job tree idea is
a better abstraction. It groups jobs together similar to how a process group
would in other shells. Begin to remove the notion of lineage.