Commit ec3d3a481 (Support "$(cmd)" command substitution without line
splitting, 2021-07-02) started treating an input string like
"a$()b" as if it were "a"$()"b". Yet, we do not actually insert the
virtual quotes. Instead we just adapted the definition of when quotes
are closed - hence the changes to quote_end().
parse_util_locate_cmdsubst_range() is aware
of the changes to quote_end() but some of its
callers like parse_util_detect_errors_in_argument() and
highlighter_t::color_as_argument() are not. They split strings at
command substitution boundaries without handling the special quoting
rules. (Only the expansion logic did it right.)
Fix this by handling the special quoting rules inside
parse_util_locate_cmdsubst_range(). This is a bit hacky since it
makes it harder for callers to process some substrings in between
command substitutions, but that's okay because current callers only
care about what's inside the command substitutions.
Fixes#8394
For some reason on a current glibc 2.33, the configure check fails.
The man page says we'd have to define XOPEN_SOURCE>=700, but I don't
want to do that since it changes a bunch of other things, and it
didn't work in my tests.
So we just force it, since we know it works (since glibc 2.3).
This is a performance difference of ~20% for printf, so it's a
reasonably big deal.
Since #4376, for-loops would set the loop variable outside, so it
stays valid.
They did this by doing the equivalent of
```fish
set -l foo $foo
for foo in 1 2 3
```
And that first imaginary `set -l` would also fire a set-event.
Since there's no use for it and the variable isn't actually set, we
remove it.
Fixes#8384.
This was meant to trigger the wcstring_list_t overload by constructing one with `{norm_dir}`. Older gcc can't figure out what to do.
So instead we use the wcstring overload for now.
This used to construct a vector, which was then passed down and filled
with a new event_t each go around the loop. That's useless - we fire
one event here, and it's simply the variable event.
This reduces the overhead of a for-loop by ~10%:
```fish
for i in (seq 100000)
true
end
```
runs in about 90% of the time now.
This reverts commit 61cd05efb0.
It is true that we detect break and continue errors statically, but they can
still be invoked dynamically, example:
set sneaky break
$sneaky # dynamically breaks from the loop
or just `eval break`.
A followup commit will add tests for this.
function_info_t was the "mutable bits" of a function, like its
description. But we have eliminated all of those, so we can eliminate
the class.
No functional change here.
This was already printed by builtin_missing_argument/unknown_option.
Since we need more control (because we add our own errors in other
places), teach builtin_unknown_option to suppress the trailer, like
missing_argument already could.
And then use it.
Fixes#8368.
Now looks like
```
Error: Wrong color in test at index 8-11 in text (expected 0x6, actual 0x2):
command echo abc foo &
^^^^
```
instead of repeating the error for every character that is wrong.
This introduces a new variable, $fish_color_option, that can be used
to highlight options differently.
Options are tokens starting with `-`, but only up to (and including!)
the first `--`.
Fixes#8292.
__GLIBC_PREREQ is the preferred way to conditionally enable features
based on glibc versions. Use it to avoid expensive parsing and
locale sensitivity. See #8204
We want to enable posix_spawn only for glibc >= 2.24, so we check
gnu_get_libc_version() at runtime. This returns a string with the
version number.
Because it's a version number it's spelt with a "." and never a ",",
but we interpret it as a float. This is iffy to begin with, but simple
enough. Only when the locale uses a ",", things break - it'll read it
as "2" and fail the check, which absolutely *tanks* performance on WSL1.
I'm unsure if this gives the proper runtime glibc version - it might,
whereas __GLIBC_MINOR__ and such definitely would not.
So fix the immediate problem by at least using a c locale - this is
already masked by 8dc3982408, but better
safe than sorry.
In most cases, like math, we want C-semantics for floating point
numbers. In particular "." needs to be the decimal separator.
Instead, we pay the price in printf, which is currently the sole place
to output in locale-specific numbers and attempt to read them and
C-style ones.
Similar to `test`, `_` is so likely to at least slow down if not
break all things catastrophically that it ought not be allowed as a
function name. Fixes#8342
This just compares two longs as strings on the go.
We can simply
1. ignore leading zeroes - they have no influence on the value
2. compare the digits char-by-char
3. keep the comparison for the first differing digit
4. if one number is longer than the other, that is larger
5. if the numbers have the same length, the one larger in the first
differing digit is larger
This makes this comparison quite a bit faster, which makes globs in
directories with numbered files up to 20% faster.
Note that, for historical reasons, this still ignores whitespace right
after the numbers!
1ab81ab90d removed one usage of iswdigit()
but there are others; more importantly, the knowledge that iswdigit() is
slow isn't preserved anywhere apart from the git history, so there's
nothing to prevent its use from creeping back into the codebase.
Another alternative is to blacklist iswdigit() (shadow it with a
function of the same name that throws a static_assert) but if we're
going to shadow it anyway, might as well make it useful.
- Only check for HAVE_CLOCK_GETTIME and HAVE_FUTIMENS on Linux, since
they are only used to implement a Linux-specific workaround related
to mtime precision.
- Make sure that hack is limited to Linux builds
- HAVE_SYS_SYSCTL_H was unused, but we should have been using it
- HAVE_TERMIOS_H was unused, remove it
The only functional change is that unix machines with clock_gettime
and futimens will not bother with a Linux-specific hack, and won't
waste time checking for either during cmake configuration either.
For some godforsaken reason it's slow on glibc
Like, actually, this manages to somehow make "echo **" 10% faster now?
The spec says this matches 0 through 9 always, so this is safe. We
also use this logic in a variety of other places already.
This allows us to skip re-wcs2stringing the base_dir again and again
by simply using the fd. It's about 10% faster in my testing.
fstatat is defined by POSIX, so it should be available everywhere.
Also for the glob version, because this is just a performance thing.
Makes `echo **` 20% faster - 100ms to 80ms for the fish repo.
This also applies to the future `path` builtin.
Still not a speed demon, but this is a very very easy win.
Now we probably gotta do globbing all in string instead of wcs2stringing ourselves to death.
Like the $status commit, this would add the offset to already existing
errors, so
```fish
(foo)
(bar)
something
```
would see the "(foo)" error, store the correct error location, then
see the "(bar)" error, and *add the offset of (bar)* to the "(foo)"
error location.
Solve this by making a new error list and appending it to the existing
ones.
There's a few other ways to solve this, including:
- Stopping after the first error (we only display the first anyway, I
think?)
- Making it so the source location has an "absolute" flag that shows
the offset has already been added (but do we ever need to add two offsets?)
I went with the simpler fix.
This would break the location of any prior errors without doing
anything of value.
E.g.
```fish
echo foo | exec grep # this exec is not allowed!
$status
somethingelse # The error might be found here!
```
Would apply the offset of `$status` to the offset of `exec`, locating
the error for `exec` somewhere after $status!
Allows the compiler to know our bespoke assert functions
are cold paths. This would normally occur somehow for real assert().
Assembly does appear it will save some branches.
Also don't worry about NDEBUG
(This doesn't matter because we rolled our own assert functions.
Thanks @zanchey.)
Just guess anew when it's not set.
(this still uses the value of $fish_emoji_width, but clamped to 1 or 2
- we could also guess if it's an unusable value, but that's a
different issue and tbh this variable is becoming less and less useful
as time moves on and things move to the new widths by default)
Fixes#8274.
This makes it so we treat backspaces as width -1, but never go below a
0 total width when talking about *lines*, like in screen or string
length --visible.
Fixes#8277.
OpenBSD has a posix_spawn implementation which fails to return ENOEXEC
on a shebangless script, causing us to fail the shebangless tests.
Disable posix_spawn on OpenBSD.
OpenBSD's mmap is famously unsychronized with file IO. In theory fsync
and msync can be used to synchronize but I was unable to get it to work.
Just don't use mmap for history on OpenBSD. This fixes the history merge
tests.
When getting the hostname to construct the legacy uvar path, if the
hostname is empty, we will create a path pointing at a directory. On
BSDs this path can be successfully open'd and we will produce errors
about invalid uvar files.
The "linear" wildcard_match actually contained a bug that compared two
strings on every iteration, causing this to be much slower than
necessary. Fix this.
To broadcast a uvar change on Linux, we write to a named pipe, wait a bit,
and then read it back. While the pipe is readable, fish will enter a "polling
mode" where it will check for uvar changes every N msec, until the pipe is no
longer readable. If the pipe stays readable for too long (5 seconds), fish
will try to drain it; this may happen if broadcasting instance of fish is
killed before it can read back its data.
In #8209 we have a case where fish is launched in the background to set a
uvar, and then immediately exits, leaving data on the pipe. This means that
we are perpetually in a polling mode until we hit that timeout. Reduce the
timeout to 1 second and the polling interval to 10 msec.
This improves #8209; it doesn't fix it fully but I think it's the best we can
do absent some other IPC mechanism.
Now that we removed EROTTEN which had the same error code as EPERM,
we can give a less confusing error in case a user has not allowed
their terminal access to a directory.
See #8264
When cd is passed a broken symlink, this changes the error message from
"no such directory" to "broken symbolic link". This scenario probably
won't happen very often since completion won't suggest broken symlinks
but it can't hurt to give a good error.
Fish used to do this until 7ac5932. This logic used to be in
path_get_cdpath, however, that is only used for highlighting, so we
don't need error messages there. Changing cd is enough.
Reword from "rotten" to "broken" since that's what file(1) uses.
Clean-up leftovers from old "rotten" code (nomen est omen).
See #8264
This currently changes builtin realpath with the "-s" option:
builtin realpath -s ///tmp
previously would print "///tmp", now it prints "/tmp".
The only thing "allow_leading_double_slashes" does is allow *two*
slashes.
This is important for `path match`, to be introduced in #8265.
Tmux has support for wrapping arbitrary escape sequences inside
```
\ePtmux;\e%s\e\\
```
Since this ends like the screen title escape, we just reuse that.
Characteristically, this is basically undocumented, but we already use
it in e.g. fish_vi_cursor.
The default matching logic for fish_tests was prefix based, so when we
were running `history` we were also running all history tests. This
causes the test to fail for an unknown reason.
Even though we are using CMake's ctest for testing, we still define our
own `make test` target rather than use its default for many reasons:
* CMake doesn't run tests in-proc or even add each tests as an
individual node in the ninja dependency tree, instead it just bundles
all tests into a target called `test` that always just shells out to
`ctest`, so there are no build-related benefits to not doing that
ourselves.
* CMake devs insist that it is appropriate for `make test` to never
depend on `make all`, i.e. running `make test` does not require any
of the binaries to be built before testing.
* The only way to have a test depend on a binary is to add a fake test
with a name like "build_fish" that executes CMake recursively to
build the `fish` target.
* It is not possible to set top-level CTest options/settings such as
CTEST_PARALLEL_LEVEL from within the CMake configuration file.
* Circling back to the point about individual tests not being actual
Makefile targets, CMake does not offer any way to execute a named
test via the `make`/`ninja`/whatever interface; the only way to
manually invoke test `foo` is to to manually run `ctest` and specify
a regex matching `foo` as an argument, e.g. `ctest -R ^foo$`... which
is really crazy.
With this patch, it is now possible to execute any single test by name,
by invoking the build directly, e.g. to run the `universal.fish` check:
`cmake --build build --target universal.fish` or
`ninja -C build universal.fish`. Unfortunately, this is not integrated
into the Makefile wrapper, so `make universal.fish` won't work (although
this can potentially be hacked around).
Instead of compiling `fish_tests.cpp` dynamically with weakly-linked
symbols and asking it to print the list of all available tests, we
use a magic string `#define`'d as a no-op to allow CMake to regex search
for matching test groups. This speeds up configuration somewhat (by not
compiling anything), but more importantly, it's much less brittle and
doesn't involve and linker dark magic.
There's of course still no getting around the fact that it's really ugly.
We have a *lot* of color sequences to try and tparm is slow (on the
whole, when you do this thousands of times).
So let's just check colors last, which makes everything else (which is
comparatively nothing) faster, while barely impacting
colors (benchmarking confirms no measurable difference).
Fixes#8253.
Fixes#8232.
Note that this needed to have expect_prompt used in the pexpect test -
we might want to add a "catchup" there so you can just ignore the
prompt counter for a bit and pick it back up later.
* Remove safe_strerror, safe_perror and safe_append
This no longer works on new glibcs because they removed sys_errlist.
So just hardcode the relevant errno messages (and phrase them better).
Fixes#4183.
Co-authored-by: Johannes Altmanninger <aclopte@gmail.com>
The clang warning for pending_signals_t was about the operator=
return type being wrong (misc-unconventional-assign-operator).
Signed-off-by: Rosen Penev <rosenp@gmail.com>
We don't want to convert the input to a "wcstring &" because
"stage_variables" needs to have the same type as other stages, so we
can use it in a loop. Communicate that to clang-tidy.
We also don't want to take "wcstring &&". As the Google style guide
states, it's not really beneficial here, and it potentially hurts
readability because it's a relatively obscure feature.
The rest of our code contains a bunch of && parameters. We might
want to get rid of some of them.
Closes#8227
clang-tidy wrongly sees an std::move to a const ref parameter and
believes it to be pointless. The copy constructor however is deleted.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
This disables job control inside command substitutions. Prior to this
change, a cmdsub might get its own process group. This caused it to fail
to cancel loops properly. For example:
while true ; echo (sleep 5) ; end
could not be control-C cancelled, because the signal would go to sleep,
and so the loop would continue on. The simplest way to fix this is to
match other shells and not use job control in cmdsubs.
Related is #1362
The previous layout confused me for a minute as it suggested it was
possible for `pipe_next_read` to be moved twice (once in the first
conditional block, then again when the deferred process conditional
called `continue` - if and only if the deferred process *was* the last
process in the job. This patch clarifies that can't be the case.
`pipe_next_read` is moved in the body of the loop, and not
re-initialized the last go around. However, we call
`pipe_next_read.close()` after the loop, which is undefined behavior (as
it's been moved).
Best case scenario, the compiler passed the address of our copy of the
struct to `exec_process_in_job` and beyond, it went out of scope there,
the value of `fd` was set to closed (minus one), and we explicitly call
`.close()` again, in which case it does nothing.
Worst case scenario, the compiler re-uses the storage for the now-moved
struct for something else and our call to `.close()` ends up closing
some other value of `fd` (valid or invalid) and things break.
Aside from the fact that we obviously don't need to close it since it's
not assigned for the last process in the job, it's a RAII object so we
don't have to worry about manually closing it in the first place.
`escape_code_length()` was converted from returning a `size_t` to
returning a `maybe_t<size_t>` but that subtly broke all existing call
sites by forcing all input to go through the slow path of assuming a
zero-length escape sequence was found.
This is because all callers predicated their next action on what amounts
to `if (escape_code_length(...))` which would correctly skip the slow
path when `escape_code_length` returned zero, but after the conversion
to `maybe_t` contained not `maybe_t::none()` but rather
`maybe_t::some(0)` due to coercion of the result from the `size_t` local
`esc_seq_len` to the `maybe_t<size_t>` return value - which, when
coerced to a boolean returns *true* for `maybe_t::some(0)` rather than
false.
The regression was introduced in 7ad855a844
and did not ship in any released versions so no harm, no foul.
This is required for the usage of placement new. Not an issue for fish
as it gets picked up from elsewhere, but it lets one use it in a C++
test directly this way.
* commandline: Add --is-valid option to query whether it's syntactically complete
This means querying when the commandline is in a state that it could
be executed. Because our `execute` bind function also inserts a
newline if it isn't.
One case that's not handled right now: `execute` also expands
abbreviations, those can technically make the commandline invalid
again.
Unfortunately we have no real way to *check* without doing the
replacement.
Also since abbreviations are only available in command position when
you _execute_ them the commandline will most likely be valid.
This is enough to make transient prompts work:
```fish
function reset-transient --on-event fish_postexec
set -g TRANSIENT 0
end
function maybe_execute
if commandline --is-valid
set -g TRANSIENT 1
commandline -f repaint
else
set -g TRANSIENT 0
end
commandline -f execute
end
bind \r maybe_execute
```
and then in `fish_prompt` react to $TRANSIENT being set to 1.
Because we are, ultimately, interested in how many cells a string
occupies, we *have* to handle carriage return (`\r`) and line
feed (`\n`).
A carriage return sets the current tally to 0, and only the longest
tally is kept. The idea here is that the last position is the same as
the last position of the longest string. So:
abcdef\r123
ends up looking like
123def
which is the same width as abcdef, 6.
A line feed meanwhile means we flush the current tally and start a new
one. Every line is printed separately, even if it's given as one.
That's because, well, counting the width over multiple lines
doesn't *help*.
As a sidenote: This is necessarily imperfect, because, while we may
know the width of the terminal ($COLUMNS), we don't know the current
cursor position. So we can only give the width, and the user can then
figure something out on their own.
But for the common case of figuring out how wide the prompt is, this
should do.
Without escapes.
The new option is a bit cheesy, but "width" isn't as expressive and
requires an argument.
Maybe we want "pad" to also require --visible?
* Add `set --function`
This makes the function's scope available, even inside of blocks. Outside of blocks it's the toplevel local scope.
This removes the need to declare variables locally before use, and will probably end up being the main way variables get set.
E.g.:
```fish
set -l thing
if condition
set thing one
else
set thing two
end
```
could be written as
```fish
if condition
set -f thing one
else
set -f thing two
end
```
Note: Many scripts shipped with fish use workarounds like `and`/`or`
instead of `if`, so it isn't easy to find good examples.
Also, if there isn't an else-branch in that above, just with
```fish
if condition
set -f thing one
end
```
that means something different from setting it before! Now, if
`condition` isn't true, it would use a global (or universal) variable of
te same name!
Some more interesting parts:
Because it *is* a local scope, setting a variable `-f` and
`-l` in the toplevel of a function ends up the same:
```fish
function foo2
set -l foo bar
set -f foo baz # modifies the *same* variable!
end
```
but setting it locally inside a block creates a new local variable
that shadows the function-scoped variable:
```fish
function foo3
set -f foo bar
begin
set -l foo banana
# $foo is banana
end
# $foo is bar again
end
```
This is how local variables already work. "Local" is actually "block-scoped".
Also `set --show` will only show the closest local scope, so it won't
show a shadowed function-level variable. Again, this is how local
variables already work, and could be done as a separate change.
As a fun tidbit, functions with --no-scope-shadowing can now use this to set variables in the calling function. That's probably okay given that it's already an escape hatch (but to be clear: if it turns out to problematic I reserve the right to remove it).
Fixes#565