Blocks will soon need to be shared across parsers. Migrate the loop status
(like break or continue) from the block into the libdata. It turns out we
only ever need one; we don't need to track this per-block.
Make it an enum class.
Brace expansion with single words in it is quite useless - `HEAD@{0}`
expanding to `HEAD@0` breaks git.
So we complicate the rule slightly - if there is no variable expansion
or "," inside of braces, they are just treated as literal braces.
Note that this is technically backwards-incompatible, because
echo foo{0}
will now print `foo{0}` instead of `foo0`. However that's a
technicality because the braces were literally useless in that case.
Our tests needed to be adjusted, but that's because they are meant to
exercise this in weird ways.
I don't believe this will break any code in practice.
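As a rough sketch of the rule (not the actual expansion code), the decision boils down to a predicate like the one below. The name `braces_expand` is hypothetical, and treating any `$` as a variable expansion is a simplification that ignores quoting and escaping.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: given the text between '{' and '}', decide whether
// brace expansion should happen at all.
// Assumption: any '$' counts as a variable expansion; quotes and escapes
// are not considered in this sketch.
bool braces_expand(const std::string &inner) {
    return inner.find(',') != std::string::npos ||
           inner.find('$') != std::string::npos;
}
```

With this predicate, the `0` in `HEAD@{0}` does not qualify, so the braces stay literal, while `{a,b}` and `{$x}` still expand.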
Fixes #5869.
Prior to this fix, a function_block stored a process_t, which was only used
when printing backtraces. Switch this to an array of arguments, and make
various other cleanups around null terminated argument arrays.
We previously checked if fish_mode_prompt existed as a function, but
that's a bad change for those who already set it to an empty function
to have a mode display elsewhere.
Updated widechar_width takes care of it.
Technically, this does ~3 comparisons more per-character (because it
checks variation selectors and such), but that shouldn't really matter.
get_current_winsize() is intended to be lazy. It does the following:
1. Gets the termsize from the kernel
2. Compares it against the current value
3. If changed, sets COLUMNS and LINES variables
Upon setting these variables, we notice that the termsize has changed
and invalidate the termsize. Thus we were doing this work multiple times
on every screen repaint.
Put back an old hack that just marked the termsize as valid at the end
of get_current_winsize().
This just sets some special characters that we use in the reader, so
it only needs to be done before the reader is set up.
Which, as it stands, is in env_init().
This stops trying to see if the previous line is wider if it is a
prefix of the current one.
Which turns out to be true often enough that it's a net benefit.
This passes character width as an argument for a few functions.
In particular, it hardcodes a width of "1" for a space literal.
There's no reason to compute wcwidth for the length of the prompt.
This measured *all* the characters on the commandline, and saved all
of them in another wcstring_list_t, just to then do... nothing with
that info.
Also, it did wcslen for something that we already have as wcstring,
reserved a vector and did a bunch of work for autosuggestions that
isn't necessary if we have more than one line.
Instead, we do what we need, which is to figure out if we are
multiline and how wide the first line is.
Fixes #5866.
line_shared_prefix explains in its comment that
> If the prefix ends on a combining character, do not include the
previous character in the prefix.
But that's not what it does.
Instead, what it appears to do is to return idx for *every* combining
mark. This seems wrong to begin with, and it also requires checking
wcwidth for *every* character.
So instead we don't do that. If we find the mismatch, we check if it's
a combining mark, and then go back to the previous character (i.e. the
one before the one that the combining mark is for).
My tests found no issues with this, other than a 20% reduction in
pasting time.
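A minimal sketch of the new approach, assuming a simplified `is_combining` that only knows the U+0300..U+036F block (the real code uses a width check) and ignoring colors. Both function names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: only the Combining Diacritical Marks block counts as combining.
bool is_combining(char32_t c) { return c >= 0x0300 && c <= 0x036F; }

// Length of the prefix shared by a and b. Only the character at the
// mismatch is inspected, instead of every character along the way.
std::size_t shared_prefix(const std::u32string &a, const std::u32string &b) {
    const std::size_t n = std::min(a.size(), b.size());
    std::size_t idx = 0;
    while (idx < n && a[idx] == b[idx]) idx++;
    // If either line has a combining mark at the mismatch, the character
    // before it is the base it attaches to, so drop that one as well.
    const bool on_mark = (idx < a.size() && is_combining(a[idx])) ||
                         (idx < b.size() && is_combining(b[idx]));
    if (on_mark && idx > 0) idx--;
    return idx;
}
```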
The old commit #3f820f0 "Disable ONLCR mapping of NL output to CR-NL"
incorrectly used c_iflag instead of c_oflag, and I copied that error
in my patch. Fixed that. However, there seem to be other problems
trying to use "\x1B[A", which I have not tried to debug, so comment that out.
(However, #3f820f0 seems to mostly work if we fix it to use c_oflag.)
This read something like `o=!_validate_int`, and the flag modifier
reading kept the pointer after the `!`, so it created a long flag
called `_validate_int`, which meant it would not only error out for
```fish
argparse 'i=!_validate_int' 'o=!_validate_int' -- $argv
```
with "Long flag '_validate_int' already defined", but also set
$_flag_validate_int.
Fixes #5864.
As mentioned in #2900, something like
```fish
test -n "$var"; and set -l foo $var
```
is sufficiently idiomatic that it should be allowable.
Also fixes some additional weirdness with semicolons.
This runs build_tools/style.fish, which runs clang-format on C++, fish_indent on fish and (new) black on python.
If anything is wrong with the formatting, we should fix the tools, but automated formatting is worth it.
This removes semicolons at the end of the line and collapses
consecutive ones, while replacing meaningful semicolons with newlines.
I.e.
```fish
echo;
```
becomes
```fish
echo
```
but
```fish
echo; echo
```
becomes
```fish
echo
echo
```
Fixes #5859.
This was a sort of side channel that was only used to propagate redraws
after universal variable changes. We can eliminate it and handle these
more directly.
tsan does funny things to signals, preventing signals from being delivered
in a blocking read. Switch the topic monitor to non-blocking reads under
tsan.
This keeps all unknown options in $argv, so
```fish
argparse -i a/alpha -- -a banana -o val -w
```
results in $_flag_a set to banana, and $argv set to `-o val -w`.
This allows users to use multiple argparse passes, or to simply avoid
specifying all options e.g. in completions - `systemctl` has 46 of
them, most not having any effect on the completions.
Fixes #5367.
This cleans up how functions are stored and autoloaded. It eliminates the
recursive lock. Instead there is a single normal owning_lock that protects
the entirety of the function data. Autoloading is re-implemented via the
new autoloader_t.
autoloader_t will be the reimplementation of autoloading. Crucially it no
longer manages any locking or loading itself; instead all locking and loading
is performed by clients. This makes it easier to test and helps limit its
responsibilities.
autoloading has a "feature" where functions are removed in an LRU-fashion.
But there's hardly any benefit in removing autoloaded functions. Just stop
doing it.
This is a long-standing issue with how `complete --do-complete` does
its argument parsing: It takes an optional argument, so it has to be
attached to the token like `complete --do-complete=foo` or (worse)
`complete -Cfoo`.
But since `complete` doesn't take any bare arguments otherwise (it
would error with "too many arguments" if you did `complete -C foo`) we
can just take one free argument as the argument to `--do-complete`.
It's more of a command than an option anyway, since it entirely
changes what the `complete` call _does_.
* Some comment fixes and renaming of is_iterm2_escape_seq.
The comment for is_iterm2_escape_seq incorrectly says "CSI followed by ]".
This is wrong, because CSI is ESC followed by [ (or the seldom-used 0x9b).
The procedure actually matches Operating System Command (OSC) escape codes.
Since there is nothing iterm2-specific about OSC, is_osc_escape_seq
would be a better name.
Also s_desired_append_char documents a non-existent parameter.
* Update broken iterm2 url in comment.
This was added in 04a96f6 but not strictly required to fix #5803
(verified), with the intention of hiding invisible background jobs
(created by invoking a function within a pipeline) from the user, but
that also broke intentionally created jobs from displaying as well.
I'm thinking it can't be done without keeping track of caller context vs
job context.
Closes #5824.
env_scoped_t lives between environment_t and env_stack_t.
It represents the read-only logic of env_stack_t and will be used to back
the new environment snapshot implementation.
These tests used raw, unescaped parentheses to perform `test` logical
grouping, but the test failures weren't caught because the parser
evaluation errors were not being propagated (fixed in bdbd173e).
It was unconditionally returning `parse_execution_success`. This was
causing certain parser errors to incorrectly return after evaluation
with `$status` equal to `0`, as reported after `eval`, `source`, or
sub-`fish` execution.
Prior to this change, fish used a global flag to decide if we should check
for changes to universal variables. This flag was then checked at arbitrary
locations, potentially triggering variable updates and event handlers for
those updates; this was very hard to reason about.
Switch to triggering a universal variable update at a fixed location,
after running an external command. The common case is that the variable
file has not changed, which we can identify with just a stat() call, so
this is pretty cheap.
This reverts commit cdce8511a1.
This change was unsafe. The prior version (now restored) took the lock and
then copied the data. By returning a reference, the caller holds a
reference to data outside of the lock.
This function isn't worth optimizing. Hardly any functions use this
facility, and for those that do, they typically just capture one or two
variables.
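To illustrate why the copy is the safe variant, here is a small sketch. `function_store_t` and its members are hypothetical stand-ins, not the real function set.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

using var_map_t = std::map<std::string, std::string>;

// Hypothetical stand-in for lock-protected function data.
class function_store_t {
    mutable std::mutex lock_;
    var_map_t inherit_vars_;

   public:
    void set(const std::string &key, const std::string &val) {
        std::lock_guard<std::mutex> guard(lock_);
        inherit_vars_[key] = val;
    }

    // Safe: the copy is made while the lock is held, so the caller owns
    // an independent snapshot.
    var_map_t get_copy() const {
        std::lock_guard<std::mutex> guard(lock_);
        return inherit_vars_;
    }

    // The reverted shape would be `const var_map_t &get_ref() const`:
    // the guard is released on return, yet the caller keeps reading
    // inherit_vars_ through the reference, outside of the lock.
};
```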
* Convert `function_get_inherit_vars()` to return a reference to the
(possibly) existing map, rather than a copy;
* Preallocate and reuse a static (read-only) map for the (very) common
case of no inherited vars;
* Pass references to the inherit vars map around thereafter, never
triggering the map copy (or even move) constructor.
NB: If it turns out the reference is unsafe, we can switch the inherit vars
to be a shared_ptr and return that instead.
I did not realize builtins could safely call into the parser and inject
jobs during execution. This is much cleaner than hacking around the
required shape of a plain_statement.
- fix the caret position expanding e.g. `command $,`
- improve the error reporting for not-allowed command substitutions
by figuring out where the expansion failed instead of using
SOURCE_LOCATION_UNKNOWN
- allow nullptr for parse_util_locate_brackets_range() out_string
argument if we don't need it to do any work.
Fixes #5812
`eval` has always been implemented as a function, which was always a bit
of a hack that caused some issues such as triggering the creation of a
new scope. This turns `eval` into a decorator.
The scoping issues with eval prevented it from being usable to actually
implement other shell components in fish script, such as the problems
described in #4442, which should now no longer be the case.
Closes #4443.
While `eval` is still a function, this paves the way for changing that
in the future, and lets the proc/exec functions detect when an eval is
used to allow/disallow certain behaviors and optimizations.
This adds an option --print-rusage-self to the fish executable. When set,
this option prints some getrusage stats to the console in a human-readable
way. This will be used by upcoming benchmarking support.
Followup to 394623b.
Doing it in the parser meant only top-level jobs would be reaped after
being `disown`ed, as subjobs aren't directly handled by the parser.
This is also much cleaner, as now job removal is centralized in
`process_clean_after_marking()`.
Closes #5803.
This prevents the `disown` builtin from directly removing jobs out of
the jobs list to prevent sanity issues, as `disown` may be called within
the context of a subjob (e.g. in a function or block) in which case the
parent job might not yet be done with the reference to the child job.
Instead, a flag is set and the parser removes the job from the list only
after the entire execution chain has completed.
Closes #5720.
When popping a scope from the environment stack, we currently do a lot of
nonsense like looking for changed curses variables. We want to centralize
this in env_stack_t so that it can be migrated to the env_dispatch logic.
Move this logic up one level in preparation for doing that.
This new file is supposed to encapsulate all of the logic around
reacting to variable changes, as opposed to the environment core.
This is to help break up the env.cpp monolith.
Prior to this fix, a job would only inherit a pgrp from its parent if the
first command were external. There seems to be no reason for this
restriction and this causes tcsetpgrp() churn, potentially causing SIGTTIN.
Switch to unconditionally inheriting a pgrp from parents.
This should fix most of #5765; the only remaining question is
tcsetpgrp from builtins.
Prior to this fix, in every call to job_continue, fish would reclaim the
foreground pgrp. This would cause other jobs in the pipeline (which may
have another pgrp) to receive SIGTTIN / SIGTTOU.
Only reclaim the foreground pgrp if it was held at the point of job_continue.
This partially addresses #5765
In tests we would like to arrange for an executable to invoke certain
system calls, e.g. to claim or relinquish control of the terminal. This is
annoying to do portably via e.g. perl. fish_test_helper is a little
program where we can add custom commands to make it act in certain ways.
This set the term modes to the shell-modes, including disabling
ICRNL (translating \cm to \cj) and echo.
The rationale given was that `reader_interactive_init()` would only be
called >= 250ms later, which I _highly_ doubt considering fish's total
startup time is 8ms for me.
The main idea was that this would stop programs like tmuxinator that
send shortcuts early from failing _iff_ the shortcut was \cj, which
also seems quite unusual.
This works both with `rm -i` and `read` in config.fish, because `read`
explicitly calls `reader_push`, which then initializes the shell modes.
The real fix would involve reordering our init so we set up the
modesetting first, but that's quite involved and the remaining issue
should barely happen, while it's fairly common to have issues with a
prompt in config.fish, and the workaround for the former is simpler, so let's leave it for now.
Partially reverts #2578.
Fixes #2980.
Putting larger members before smaller ones will reduce structure
sizes. bools are 1 byte. On 64-bit systems I think these were reduced:
wgetopt.h:46: 64 to 56 bytes
builtin_history.cpp:30: 48 to 32 bytes
builtin_status.cpp:91: 32 to 24 bytes
tinyexpr.cpp:69: 40 to 32 bytes
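A sketch of the effect with two made-up structs (not the ones listed above). On a typical 64-bit ABI I would expect 32 versus 24 bytes here, but the exact numbers depend on the platform's alignment rules.

```cpp
#include <cassert>
#include <cstdint>

// Small members interleaved with large ones: each bool is followed by
// padding so that the next 8-byte member is aligned.
struct small_first_t {
    bool a;
    std::uint64_t x;
    bool b;
    std::uint64_t y;
};

// Same members, larger ones first: the bools pack together at the end.
struct large_first_t {
    std::uint64_t x;
    std::uint64_t y;
    bool a;
    bool b;
};

static_assert(sizeof(large_first_t) <= sizeof(small_first_t),
              "reordering should never make the struct bigger");
```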
The data stored in these containers is small enough that it is worth
creating distinct sets for each lookup.
In a microbenchmark of these changes, the single-lookup version of the
function with lookups gated on the length of input (bypassed entirely if
the input is longer than the longest key in the container) provided a
1.5x-3.5x speedup over the previous implementation.
Additionally, as the collections are static and their contents are never
modified after startup, it makes no sense to continuously calculate the
location of and allocate an iterator for the `!= foo.end()` comparison;
the end iterator is now statically cached.
I'm not expecting massive speed gains out of this change, but the parser
does perform enough of these to make it worth optimizing in this way.
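Roughly, each lookup now has the following shape. The keyword list and the name `is_keyword` are placeholders for illustration, not the actual sets in the parser.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical single-purpose lookup with its own small static set.
bool is_keyword(const std::string &word) {
    static const std::unordered_set<std::string> keywords = {"if", "else",
                                                             "while", "end"};
    // The set never changes after initialization, so end() is cached once.
    static const auto end = keywords.end();
    // Length of the longest key, also computed once.
    static const std::size_t max_len = [] {
        std::size_t longest = 0;
        for (const auto &key : keywords) longest = std::max(longest, key.size());
        return longest;
    }();

    // Gate: input longer than the longest key cannot match, skip hashing.
    if (word.size() > max_len) return false;
    return keywords.find(word) != end;
}
```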
This reverts commit 7a74198aa3.
Believe it or not this commit actually increased copying. When accepting
a value you know you're going to take ownership of, just accept it by
value; then temporaries can invoke the move ctor and blah blah blah.
We really need a lightweight refcounted pass-by-value string to make this
less error prone.
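For reference, the pattern being restored looks roughly like this. `holder_t` is a made-up class for the sake of the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical sink: it takes ownership of the string it is given.
class holder_t {
    std::string value_;

   public:
    // By value: an lvalue argument is copied once into v, a temporary is
    // moved into v, and in both cases v is then moved into the member.
    void set(std::string v) { value_ = std::move(v); }

    const std::string &get() const { return value_; }
};
```

Taking `const std::string &` instead forces a copy inside `set` even when the caller passed a temporary that could have been moved.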
If we switch the bind mode, we add a "force-repaint" there just to
redraw the mode indicator.
That's quite wasteful and annoying, considering that sometimes the prompt can take
half a second.
So we add a "repaint-mode" function that just reexecutes the
mode-prompt and uses the cached values for the others.
Fixes #5783.
As it turns out it didn't work much better, and it fell behind in
support when it comes to things that wcwidth traditionally can't
express like variation selectors and hangul combining characters, but
also simply $fish_*_width.
I've had to tell a few people now to rebuild with widecharwidth after
sending them on a fool's errand to set X variable.
So keeping this option is doing our users a disservice.
* Add "expand-abbr" bind function
This can be used to explicitly allow expanding abbreviations.
* Make expanding abbr explicit
NOTE: This accepts them for space only, we currently also do it for \n
and \r.
* Remove now dead code
We no longer trigger an abbr implicitly, so we can remove the code
that does it.
* Fix comment
[ci skip]
Directly access the job list without the intermediate job_iterator_t,
and remove functions that are ripe for abuse by modifying a local
enumeration of the same list instead of operating on the iterators
directly (e.g. proc.cpp iterates jobs, and mid-iteration calls
parser::job_remove(j) with the job (and not the iterator to the job),
causing an invisible invalidation of the pre-existing local iterators).
This printed weird things like
```fish
$ functions -x
functions: Unknown option '-x'
(Type 'help functions' for related documentation)
```
Instead, let's make it
```fish
$ functions -x
functions: Unknown option '-x'
(Type 'help functions' for related documentation)
```
This was printed basically everywhere.
The user knows what they executed on standard input.
A good example:
```fish
set c (subme 513)
```
used to print
```
fish: Too much data emitted by command substitution so it was discarded
set -l x (string repeat -n $argv x)
^
in function 'subme'
called on standard input
with parameter list '513'
in command substitution
called on standard input
```
and now it is
```
fish: Too much data emitted by command substitution so it was discarded
set -l x (string repeat -n $argv x)
^
in function 'subme' with arguments '513'
in command substitution
```
See #5434.
Now:
```
cd: Unknown option '-r'
~/dev/fish-shell/share/functions/cd.fish (line 40):
builtin cd $argv
^
in function 'cd' with arguments '-r'
in function 'f'
in function 'd'
in function 'b' with arguments '-1q --wurst'
in function 'a'
called on standard input
```
See #5434.
This printed things like
```
in function 'f'
called on standard input
in function 'd'
called on standard input
in function 'b'
called on standard input
in function 'a'
called on standard input
```
As a first step, it removes the empty lines so it's now
```
in function 'f'
called on standard input
in function 'd'
called on standard input
in function 'b'
called on standard input
in function 'a'
called on standard input
```
See #5434.
This switches env_var_t to be an immutable value type, and stores its
contents via a shared_ptr. This eliminates string copying when fetching
env_var_t values.
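The idea, sketched with a hypothetical `env_value_t` (the real env_var_t carries more state, such as flags):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an immutable variable value.
class env_value_t {
    // Contents are const and shared, so copying the value only copies
    // the pointer and bumps a reference count.
    std::shared_ptr<const std::vector<std::string>> vals_;

   public:
    explicit env_value_t(std::vector<std::string> vals)
        : vals_(std::make_shared<const std::vector<std::string>>(std::move(vals))) {}

    const std::vector<std::string> &as_list() const { return *vals_; }

    // Only here to make the sharing observable in this sketch.
    bool shares_storage_with(const env_value_t &other) const {
        return vals_ == other.vals_;
    }
};
```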
If a function process is deferred, allow it to be unbuffered.
This permits certain simple cases where functions are piped to external
commands to execute without buffering.
This is a somewhat-hacky stopgap measure that can't really be extended
to more general concurrent processes. However it is overall an improvement
in user experience that might help flush out some bugs too.
In a job, a deferred process is the last fish internal process which pipes
to an external command. Execute the deferred process last; this will allow
for streaming its output.
I believe this was selected to be artificially low for the sake
of it displaying well in prompts. But people should expect to get
the same output as can be gotten from `hostname`.
Fixes #5758
The code already allowed for variable width (multicell) *display* of the
newline omitted character, but there was no way to define it as being
more than one `wchar_t`.
This lets us use a string on console sessions (^J aka newline feed)
instead of an ambiguous character like `@` (used in some versions of
vim for ^M) or `~` (what we were using).
The system version of `wcwidth()` reflects the capabilities of the
system's own virtual terminal's view of the width of the character in
question, while fish's enhanced version (`widechar_wcwidth`) is much too
smart for most login terminals, which generally barely support anything
beyond ASCII text.
If, at startup, it is detected that we are running under a physical
console rather than within a terminal emulator running in a desktop
environment, take that as a hint to use the system-provided `wcwidth`.
The commit began passing the length of the wide string rather than the
length of the narrowed string after conversion via `wcstombs`. We *do*
have the actual length, but it's not (necessarily) the same as the
original value. We need to pass the result of `wcstombs` instead.
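In other words, the byte count has to come from the conversion itself. A minimal sketch, where `narrow` is an illustrative helper rather than the function touched by this commit:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Convert a wide string to a narrow one in the current locale.
std::string narrow(const std::wstring &in) {
    // Worst case: every wide character needs MB_LEN_MAX bytes, plus a NUL.
    std::vector<char> buf(in.size() * MB_LEN_MAX + 1);
    const std::size_t n = std::wcstombs(buf.data(), in.c_str(), buf.size());
    // wcstombs returns (size_t)-1 if a character cannot be converted.
    if (n == static_cast<std::size_t>(-1)) return std::string();
    // Use n, the number of bytes written, and not in.size(): the two
    // differ as soon as a character takes more than one byte.
    return std::string(buf.data(), n);
}
```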
POSIX dictates here that incomplete conversions, like in
printf %d\n 15.2
or
printf %d 14g
are still printed along with any error.
This seems alright, as it allows users to silence stderr to accept incomplete conversions.
This commit implements it, but what's a bit weird is the ordering between stdout and stderr,
causing the error to be printed _after_, like
15
14
15.1: value not completely converted
14,2: value not completely converted
but that seems like a general issue with how we buffer the streams.
(I know that nonfatal_error is a copy of most of fatal_error - I tried
differently, and va_* is weird)
Fixes #5532.
Before this change, - was sorted with other punctuation before
A-Z. Now, it sorts above the rest of the characters.
This has a practical effect on completions, where when there are
both -s and --long with the same description, the short option
is now before the long option in the pager, which is what is now
selected when navigating `foo -<TAB>`. The long options can be
picked out with `foo --<TAB>`. Before, short options which
duplicated a long option literally could not be selected by
any means from the pager.
Fixes #5634
This tweaks wcsfilecmp such that certain punctuation characters will
come after A-Z.
A big win with `set <TAB>` - the __prefixed fish junk now comes
after the stuff users should care about.
This disables an extra round of escaping in the `string replace -r`
replacement string.
Currently, to add a backslash to an a or b (to "escape" it):
string replace -ra '([ab])' '\\\\\\\$1' a
7 backslashes!
This removes one of the layers, so now 3 or 4 works (each one escaped
for the single-quotes, so pcre receives two, which it reads as one literal):
string replace -ra '([ab])' '\\\\$1' a
This is backwards-incompatible as replacement strings will change
meaning, so we put it behind a feature flag.
The name is kinda crappy, though.
Fixes #5474.
As a simple replacement for `wc -l`.
This counts both lines on stdin _and_ arguments.
So if "file" has three lines, then `count a b c < file` will print 6.
And since it counts newlines, like wc, `echo -n foo | count` prints 0.
Mostly related to usage _(L"foo"), keeping in mind the _
macro does a wcstring().c_str() already.
And a smattering of other trivial micro-optimizations certain
to not help tangibly.
C++11 provides std::min/std::max which we're using all over,
obviating the need for our own templates for this.
util.h now only provides two things: get_time and wcsfilecmp.
This commit removes everything that includes it which doesn't
use either; most because they no longer need mini or maxi from
it but some others were #including it unnecessarily.
Hangul uses three codepoints to combine to one glyph. The first has a
width of 2 (like the final glyph), but the second and third were
assigned a width of 1, which seems to match EastAsianWidth.txt:
> 1160..11FF;N # Lo [160] HANGUL JUNGSEONG FILLER..HANGUL JONGSEONG SSANGNIEUN
Instead, we override that and treat the middle and end codepoint as combiners,
always, because there's no way to figure out what the terminal will
think and that's the way it's supposed to work.
If they stand by themselves or in another combination, they'll indeed
show up with a width of 1 so we'll get it wrong, but that's less
likely and not expressible with wcwidth().
Fixes #5729.
This only did prefix matching, which is generally less useful.
All existing users _should_ be okay with this since they want to
provide completions.
Fixes #5467.
Fixes #2318.
This addresses a few places where -Wswitch-enum showed one or two missing
case's for enum values.
It did uncover and fix one apparent oversight:
$ function asd -p 100
echo foo
end
$ functions --handlers-type exit
Event exit
asd
It looks like this should be showing a PID before 'asd' just like
job_exit handlers show the job id. It was falling
through to default: which just printed the function name.
$ functions --handlers-type exit
Event exit
100 asd
This tried to skip conversion if the locale had MB_CUR_MAX == 1, but
in doing so it just entered an infinite recursion (because
writestr(wchar_t*) called writestr(wchar_t*)).
Instead, just let wcstombs handle it.
Fixes #5724.
Since Unicode 9, the width of some characters changed to 2.
Depending on the system, it might have support for it, or it might
not.
Instead of hardcoding specific glibc etc versions, we check what the
system wcwidth reports for "😃", U+1F603 "Grinning Face With Big Eyes".
The intention is to, in most cases, make setting $fish_emoji_width
unnecessary, but since it sets the "guessed_emoji_width", that variable still takes precedence if it is set.
Unfortunately this approach has some caveats:
- It relies on the locale being set to a unicode-supporting one.
(C.UTF-8 is unfortunately not standard, so we can't use it)
- It relies on the terminal's wcwidth having unicode9 support IFF the
system wcwidth does.
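The probe itself is tiny. A sketch, assuming a 32-bit wchar_t and a POSIX `wcwidth`; the name `guess_emoji_width` and the fallback to 1 for any answer other than 2 are choices made for this example:

```cpp
#include <cassert>
#include <wchar.h>

// Ask the system wcwidth about U+1F603 and derive a guessed emoji width.
// Assumption: wchar_t can hold U+1F603 directly (true where it is 32-bit).
int guess_emoji_width() {
    const int w = wcwidth(static_cast<wchar_t>(0x1F603));
    // 2 means the system tables know about Unicode 9 widths. Anything
    // else (1, or -1 in a non-unicode locale) falls back to 1 here.
    return w == 2 ? 2 : 1;
}
```

Without a unicode locale set up beforehand, the system answer is not meaningful, which is the first caveat above.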
This is like #5722, but at runtime.
The additional caveat is that we don't try to achieve a unicode
locale, but since we re-run the heuristic when the locale changes (and
we try to get a unicode locale), we should still often get the correct
value.
Plus if you use a C locale and your terminal still displays emoji,
you've misconfigured your system.
Fixes #5722.