Closes#9240.
Squash of the following commits (in reverse-chronological order):
commit 03b5cab3dc40eca9d50a9df07a8a32524338a807
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date: Sun Sep 25 15:09:04 2022 -0500
Handle differently declared posix_spawnxxx_t on macOS
On macOS, posix_spawnattr_t and posix_spawn_file_actions_t are declared as void
pointers, so we can't use maybe_t's bool operator to test if it has a value.
commit aed83b8bb308120c0f287814d108b5914593630a
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date: Sun Sep 25 14:48:46 2022 -0500
Update maybe_t tests to reflect dynamic bool conversion
maybe_t<T> is now bool-convertible only if T _isn't_ already bool-convertible.
commit 2b5a12ca97b46f96b1c6b56a41aafcbdb0dfddd6
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date: Sun Sep 25 14:34:03 2022 -0500
Make maybe_t a little harder to misuse
We've had a few bugs over the years stemming from accidental misuse of maybe_t
with bool-convertible types. This patch disables maybe_t's bool operator if the
type T is already bool convertible, forcing the (barely worth mentioning) need
to use maybe_t::has_value() instead.
This patch both removes maybe_t's bool conversion for bool-convertible types and
updates the existing codebase to use the explicit `has_value()` method in place
of existing implicit bool conversions.
Let's hope this doesn't causes build failures for e.g. musl: I just
know it's good on macOS and our Linux CI.
It's been a long time.
One fix this brings, is I discovered we #include assert.h or cassert
in a lot of places. If those ever happen to be in a file that doesn't
include common.h, or we are before common.h gets included, we're
unawaringly working with the system 'assert' macro again, which
may get disabled for debug builds or at least has different
behavior on crash. We undef 'assert' and redefine it in common.h.
Those were all eliminated, except in one catch-22 spot for
maybe.h: it can't include common.h. A fix might be to
make a fish_assert.h that *usually* common.h exports.
Or should we stop using it?
I'm fine with either always or never using auto-formatting but our current
way of using it only sometimes is confusing.
No functional change.
This is after we've tried to find the interpreter, so we would already
have complained about e.g. /usr/bin/pthyon not existing.
Realistically the most common case here is things that don't start
with a shebang like ELFs. Writing special extraction code here is
overkill, and I can't see a good function to do it for us.
But this should point you in the right direction.
Fixes#8938
If we get an E2BIG while executing a process, we check how large the
exported variables are. We already did this, but then immediately
added it to the total.
So now we keep the tally just for the variables around, and if it's
over half (which is an atypical value if your system has an ARG_MAX of
2MB), we mention that in the error.
Figuring out which variable is too big (in case it's just one) is probably too complicated,
but we can at least complain if things seem suspect.
Untested because I don't know *how* to do so portably
This is a big cleanup to how tty transfer works. Recall that when job
control is active, we transfer the tty to jobs via tcsetpgrp().
Previously, transferring was done "as needed" in continue_job. That is, if
we are running a job, and the job wants the terminal and does not have it,
we will transfer the tty at that point.
This got pretty weird when running mixed pipelines. For example:
cmd1 | func1 | cmd2
Here we would run `func1` before calling continue_job. Thus the tty
would be transferred by the nested function invocation, and also restored
by that invocation, potentially racing with tty manipulation from cmd1 or
cmd2.
In the new model, migrate the tty transfer responsibility outside of
continue_job. The caller of continue_job is then responsible for setting up
the tty. There's two places where this gets done:
1. In `exec_job`, where we run a job for the first time.
2. In `builtin_fg` where we continue a stopped job in the foreground.
Fixes#8699
This is a cleanup of job groups, rationalizing a bunch of stuff. Some
notable changes (none user-visible hopefully):
1. Previously, if a job group wanted a pgid, then we would assign it to the
first process to run in the job group. Now we deliberately mark which
process will own the pgroup, via a new `leads_pgrp` flag in process_t. This
eliminates a source of ambiguity.
2. Previously, if a job were run inside fish's pgroup, we would set fish's
pgroup as the group of the job. But this meant we had to check if the job
had fish's pgroup in lots of places, for example when calling tcsetpgrp.
Now a job group only has a pgrp if that pgrp is external (i.e. the job is
under job control).
Only show the shebang warning for .fish commands.
Use the phrase "interpreter directive" as the formal name for the
shebang.
Switch from windows to Windows for the operating system.
When we execute something and it doesn't have a shebang, typically we
fall back on running it with /bin/sh. For .fish scripts, we still
refuse to do this (assuming that /bin/sh won't handle .fish scripts properly).
Only the error wasn't great. So we now explicitly mention when there's
a missing shebang, and point towards the shebang line otherwise.
If you make a script called `foo` somewhere in $PATH, and did not give
it a shebang, this would end up calling
sh foo
instead of
sh /usr/bin/foo
which might not match up.
Especially if the path is e.g. `--version` or `-` that would end up
being misinterpreted *by sh*.
So instead we simply pass the actual_cmd to sh, because we need it
anyway to get it to fail to execute before.
As seen in
https://stackoverflow.com/questions/70139844/how-to-execute-custom-fish-scripts-in-custom-path-folder,
making a shebang like
#!usr/bin/fish
won't work, and will error with the default "file does not exist"
error *pointing to the file, not the interpreter*.
Detect that interpreter properly.
We might want to make this an even more specific error, but now it
says
```
exec: Failed to execute process '/home/alfa/.local/bin/borken.fish': The file specified the interpreter 'usr/bin/fish', which is not an executable command.
```
Which is okay.
* Remove safe_strerror, safe_perror and safe_append
This no longer works on new glibcs because they removed sys_errlist.
So just hardcode the relevant errno messages (and phrase them better).
Fixes#4183.
Co-authored-by: Johannes Altmanninger <aclopte@gmail.com>
FISH_USE_POSIX_SPAWN is always defined, thanks to the line
#define FISH_USE_POSIX_SPAWN HAVE_SPAWN_H
So replace #ifdef with #if to fix compilation on platforms lacking
spawn.h. Also make the spawn.h inclusion condition consistent across
files.
This change modifies the fish safety check surrounding execve / spawn so
it can run shell scripts having concatenated binary content. We're using
the same safety check as FreeBSD /bin/sh [1] and the Z-shell [5]. POSIX
was recently revised to require this behavior:
"The input file may be of any type, but the initial portion of the
file intended to be parsed according to the shell grammar (XREF to
XSH 2.10.2 Shell Grammar Rules) shall consist of characters and
shall not contain the NUL character. The shell shall not enforce
any line length limits."
"Earlier versions of this standard required that input files to the
shell be text files except that line lengths were unlimited.
However, that was overly restrictive in relation to the fact that
shells can parse a script without a trailing newline, and in
relation to a common practice of concatenating a shell script
ending with an 'exit' or 'exec $command' with a binary data payload
to form a single-file self-extracting archive." [2] [3]
One example use case of such scripts, is the Cosmopolitan C Library [4]
which configuse the GNU Linker to output a polyglot shell+binary format
that runs on Linux / Mac / Windows / FreeBSD / OpenBSD / NetBSD / BIOS.
Fixesjart/cosmopolitan#88
[1] 9a1cd36331
[2] http://austingroupbugs.net/view.php?id=1250
[3] http://austingroupbugs.net/view.php?id=1226#c4394
[4] https://justine.lol/cosmopolitan/index.html
[5] 326d9c203b
This may slightly improve performance by allowing the compiler greater
visibility into what is happing on top of not executing at runtime in
some hot paths, but more importantly, it gets rid of magic constants in a
few different places.
This changes how fish attempts to protect itself from calling tcsetpgrp() too
aggressively. Recall that tcsetpgrp() will "force" itself, if SIGTTOU is
ignored (which it is in fish when job control is enabled).
Prior to this fix, we avoided SIGTTINs by only transferring the tty ownership
if fish was already the owner. This dated from a time before we had really
nailed down how pgroups should be assigned. Now we more deliberately assign a
job's pgroup so we don't need this conservative check.
However we still need logic to avoid transferring the tty if fish is not the
owner. The bad case is when job control is enabled while fish is running in the
background - here fish would transfer the tty and "steal" from the foreground
process.
So retain the checks of the current tty owner but migrate them to the point of
calling tcsetpgrp() itself.
Prior to this change, the posix_spawn code paths used a fair amount of
manual management around its allocated structures (attrs and file actions).
Encapsulate this into a new class that manages memory management and error
handling.
Initially I wanted to pick a different name to avoid confusion with
process groups, but really job trees *are* process groups. So name them
to reflect that fact.
Also rename "placeholder" to "internal" which is clearer.
Prior to this, jobs all had a pgid, and fish has to work hard to ensure
that pgids were inherited properly for nested jobs. But now the job tree
is the source of truth and there is only one location for the pgid.
I kinda hate how fussy clang-format is. It reflows text
constantly (line limit), forces things onto one line *except* when
they're too long, and wants to turn this:
```c++
return true;;
```
into this:
```c++
return true;
;
```
instead of, you know, eliminating the second semicolon?
Anyway, it is what it is and we use it, I'll just look into getting some
more slack.
This PR is aimed at improving how job ids are assigned. In particular,
previous to this commit, a job id would be consumed by functions (and
thus aliases). Since it's usual to use functions as command wrappers
this results in awkward job id assignments.
For example if the user is like me and just made the jump from vim -> neovim
then the user might create the following alias:
```
alias vim=nvim
```
Previous to this commit if the user ran `vim` after setting up this
alias, backgrounded (^Z) and ran `jobs` then the output might be:
```
Job Group State Command
2 60267 stopped nvim $argv
```
If the user subsequently opened another vim (nvim) session, backgrounded
and ran jobs then they might see what follows:
```
Job Group State Command
4 70542 stopped nvim $argv
2 60267 stopped nvim $argv
```
These job ids feel unnatural, especially when transitioning away from
e.g. bash where job ids are sequentially incremented (and aliases/functions
don't consume a job id).
See #6053 for more details.
As @ridiculousfish pointed out in
https://github.com/fish-shell/fish-shell/issues/6053#issuecomment-559899400,
we want to elide a job's job id if it corresponds to a single function in the
foreground. This translates to the following prerequisites:
- A job must correspond to a single process (i.e. the job continuation
must be empty)
- A job must be in the foreground (i.e. `&` wasn't appended)
- The job's single process must resolve to a function invocation
If all of these conditions are true then we should mark a job as
"internal" and somehow remove it from consideration when any
infrastructure tries to interact with jobs / job ids.
I saw two paths to implement these requirements:
- At the time of job creation calculate whether or not a job is
"internal" and use a separate list of job ids to track their ids.
Additionally introduce a new flag denoting that a job is internal so
that e.g. `jobs` doesn't list internal jobs
- I started implementing this route but quickly realized I was
computing the same information that would be computed later on (e.g.
"is this job a single process" and "is this jobs statement a
function"). Specifically I was computing data that populate_job_process
would end up computing later anyway. Additionally this added some
weird complexities to the job system (after the change there were two
job id lists AND an additional flag that had to be taken into
consideration)
- Once a function is about to be executed we release the current jobs
job id if the prerequisites are satisfied (which at this point have
been fully computed).
- I opted for this solution since it seems cleaner. In this
implementation "releasing a job id" is done by both calling
`release_job_id` and by marking the internal job_id member variable to
-1. The former operation allows subsequent child jobs to reuse that
same job id (so e.g. the situation described in Motivation doesn't
occur), and the latter ensures that no other job / job id
infrastructure will interact with these jobs because valid jobs have
positive job ids. The second operation causes job_id to become
non-const which leads to the list of code changes outside of `exec.c`
(i.e. a codemod from `job_t::job_id` -> `job_t::job_id()` and moving the
old member variable to a non-const private `job_t::job_id_`)
Note: Its very possible I missed something and setting the job id to -1
will break some other infrastructure, please let me know if so!
I tried to run `make/ninja lint`, but a bunch of non-relevant issues
appeared (e.g. `fatal error: 'config.h' file not found`). I did
successfully clang-format (`git clang-format -f`) and run tests, though.
This PR closes#6053.
This adds a test for the obscure case where an fd is redirected to
itself. This is tricky because the dup2 will not clear the CLO_EXEC bit.
So do it manually; also posix_spawn can't be used in this case.