Commit graph

133 commits

Author SHA1 Message Date
Mahmoud Al-Qudsi
99bc112de0 Fix unqualified calls to std::move
`using` is for types, not functions :(
2022-10-19 12:31:55 -05:00
Mahmoud Al-Qudsi
85d4834b35 Make maybe_t safer against accidental misuse
Closes #9240.

Squash of the following commits (in reverse-chronological order):

commit 03b5cab3dc40eca9d50a9df07a8a32524338a807
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date:   Sun Sep 25 15:09:04 2022 -0500

    Handle differently declared posix_spawnxxx_t on macOS

    On macOS, posix_spawnattr_t and posix_spawn_file_actions_t are declared as void
    pointers, so we can't use maybe_t's bool operator to test if it has a value.

commit aed83b8bb308120c0f287814d108b5914593630a
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date:   Sun Sep 25 14:48:46 2022 -0500

    Update maybe_t tests to reflect dynamic bool conversion

    maybe_t<T> is now bool-convertible only if T _isn't_ already bool-convertible.

commit 2b5a12ca97b46f96b1c6b56a41aafcbdb0dfddd6
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date:   Sun Sep 25 14:34:03 2022 -0500

    Make maybe_t a little harder to misuse

    We've had a few bugs over the years stemming from accidental misuse of maybe_t
    with bool-convertible types. This patch disables maybe_t's bool operator if the
    type T is already bool convertible, forcing the (barely worth mentioning) need
    to use maybe_t::has_value() instead.

    This patch both removes maybe_t's bool conversion for bool-convertible types and
    updates the existing codebase to use the explicit `has_value()` method in place
    of existing implicit bool conversions.
2022-10-08 11:56:38 -05:00
Sergei Shilovsky
e274ef6c0d
commandline --selection-start and --selection-end implementation
Fixes #9197
2022-10-05 18:51:00 +02:00
Fabian Boehm
dcf52dbba5 fix path --null-out
Regression from 7bc4c9674b.

Appending `"\0"` to an std::string does nothing.

I blame C++.
2022-10-05 17:25:00 +02:00
Fabian Boehm
cb28b39b24 string shorten: Make max of 0 mean no shortening
This makes it easier to just slot in `string shorten` wherever,
without having to do a weird "if test $max -gt 0" check.
2022-10-04 18:44:21 +02:00
Mahmoud Al-Qudsi
5d64b56127 Remove needless usage of maybe_t
builtin_function() never returns `none()`; this must have been leftover from a
previous version of the code.
2022-09-25 14:40:49 -05:00
Mahmoud Al-Qudsi
1811a2d725 Prevent undefined behavior by intercepting return -1
While we hardcode the return values for the rest of our builtins, the `return`
builtin bubbles up whatever the user returned in their fish script, allowing
invalid return values such as negative numbers to make it into our C++ side of
things.

In creating a `proc_status_t` from the return code of a builtin, we invoke
W_EXITCODE() which is a macro that shifts left the return code by some amount,
and left-shifting a negative integer is undefined behavior.

Aside from causing us to land in UB territory, it also can cause some negative
return values to map to a "successful" exit code of 0, which was probably not
the fish script author's intention.

This patch also adds error logging to help catch any inadvertent additions of
cases where a builtin returns a negative value (should one forget that unix
return codes are always positive) and an assertion protecting against UB.
2022-09-25 12:33:40 -05:00
Fabian Boehm
e69be38235 string: Reduce write() calls
The impact here depends on the command and how much output it
produces.

It's possible to get up to 1.5x - `string upper` being a good example,
or a no-op `string match '*'`.

But the more the command actually needs to do, the less of an effect
this has.
2022-09-22 22:41:35 +02:00
Fabian Boehm
7bc4c9674b builtins: Reduce streams.out.append/push_back calls
This basically immediately issues a "write()" if it's to a pipe or the
terminal.

That means we can reduce syscalls and improve performance, even by
doing something like

```c++
streams.out.append(somewcstring + L"\n");
```

instead of

```c++
streams.out.append(somewcstring);
streams.out.push_back(L'\n');
```

Some benchmarks of the

```fish
for i in (string repeat -n 2000 \n)
    $thing
end
```

variety:

1. `set` (printing variables) sped up 1.75x
2. `builtin -n` 1.60x
3. `jobs` 1.25x (with 3 jobs)
4. `functions` 1.20x
5. `math 1 + 1` 1.1x
6. `pwd` 1.1x

Piping yields similar results, there is no real difference when
outputting to a command substitution.
2022-09-22 22:41:35 +02:00
Fabian Boehm
c5b5dd7563 printf: Buffer output
This writes the output once per argument instead of once per format or
escaped char.

An egregious case:

```fish
printf (string repeat -n 200 \\x7f)%s\n (string repeat -n 2000 aaa\n)
```

Has been sped up by ~20x by reducing write() calls from 40000 to 200.

Even a simple

```fish
printf %s\n (string repeat -n 2000 aaa\n)
```

should now be ~1.2x faster by issuing 2000 instead of 4000 write
calls (the `\n` was written separately!).
2022-09-22 22:41:35 +02:00
Fabian Boehm
64927677c8 complete: Write each completion at once for --do-complete
This at least halves the number of "write()" calls we do if it goes to
a pipe or the terminal, or reduces them by 75% if there is a
description.

This makes

```fish
complete -c foo -xa "(seq 50000)"
complete -C"foo "
```

faster by 1.33x.
2022-09-22 22:41:35 +02:00
ridiculousfish
5f4583b52d Revert "Re-implement macro to constexpr transition"
This reverts commit 3d8f98c395.

In addition to the issues mentioned on the GitHub page for this commit,
it also broke the CentOS 7 build.

Note one can locally test the CentOS 7 build via:

    ./docker/docker_run_tests.sh ./docker/centos7.Dockerfile
2022-09-20 11:58:37 -07:00
Fabian Boehm
8b1da4b63d path: Actually use mtime instead of ctime
Fixes #9222
2022-09-20 16:10:17 +02:00
Mahmoud Al-Qudsi
3d8f98c395 Re-implement macro to constexpr transition
Be more careful with sign extension issues stemming from the differences in how
an untyped literal is promoted to an integer vs how a typed (and signed) `char`
is promoted to an integer.
2022-09-19 18:10:41 -05:00
Mahmoud Al-Qudsi
7c3e4a7ccb Revert "Convert constant macros to constexpr expressions"
This reverts commit e1626818f7.
2022-09-19 17:42:11 -05:00
Mahmoud Al-Qudsi
e1626818f7 Convert constant macros to constexpr expressions
Also convert some `const[expr] static xxx` to `const[expr] xxx` where it makes
sense to let the compiler deduce on its own whether or not to allocate storage
for a constant variable rather than imposing our view that it should have STATIC
storage set aside for it.

A few call sites were not making use of the `XXX_LEN` definitions and were
calling `strlen(XXX)` - these have been updated to use `const_strlen(XXX)`
instead.

I'm not sure if any toolchains will have raise any issues with these changes...
CI will tell!
2022-09-19 17:17:09 -05:00
Aaron Gyes
168d74ab0e IWYU 2022-09-12 18:34:19 -07:00
Aaron Gyes
864bd4a9cb builtin bind: highlight output.
This highlights `bind` output, which is commands to reproduce the
current bind state, for interactive sessions ala builtin complete.
2022-09-12 15:33:07 -07:00
Fabian Boehm
52e065e479 math: Add error length
Like we now do for syntax errors, this marks the extent of the error.

Currently for unknown functions only, would be cool for division too
2022-09-09 18:52:45 +02:00
Fabian Boehm
5edba044a3 math: Give a proper error for division by zero
This errored out *later* because the result was infinite or NaN, but
it didn't actually stop evaluation.

I'm not sure if there is a way to get floating point math to turn an
infinity back into something that doesn't depend on a literal
infinity, but division by zero conceptually isn't a thing we can
support.

There's entire branches of maths dedicated to figuring out what
dividing by "basically zero" means and we don't have to get into it.
2022-09-09 18:52:45 +02:00
Fabian Boehm
41c22d5e60 Add string shorten
This is essentially the inverse of `string pad`.
Where that adds characters to get up to the specified width,
this adds an ellipsis to a string if it goes over a specific maximum width.
The char can be given, but defaults to our ellipsis string.
("…" if the locale can handle it and "..." otherwise)

If the ellipsis string is empty, it just truncates.

For arguments given via argv, it goes line-by-line,
because otherwise length makes no sense.

If "--no-newline" is given, it adds an ellipsis instead and removes all subsequent lines.

Like pad and `length --visible`, it goes by visible width,
skipping recognized escape sequences, as those have no influence on width.

The default target width is the shortest of the given widths that is non-zero.

If the ellipsis is already wider than the target width,
we truncate instead. This is safer overall, so we don't e.g. move into a new line.
This is especially important given our default ellipsis might be width 3.
2022-09-09 18:49:57 +02:00
ridiculousfish
3eae0a9b6a clang-format all C++ files
This mostly re-sorts headers that got desorted after the IWYU
application in 14d2a6d8ff.
2022-08-21 15:02:19 -07:00
Aaron Gyes
14d2a6d8ff IWYU-guided #include rejiggering.
Let's hope this doesn't causes build failures for e.g. musl: I just
know it's good on macOS and our Linux CI.

It's been a long time.

One fix this brings, is I discovered we #include assert.h or cassert
in a lot of places. If those ever happen to be in a file that doesn't
include common.h, or we are before common.h gets included, we're
unawaringly working with the system 'assert' macro again, which
may get disabled for debug builds or at least has different
behavior on crash. We undef 'assert' and redefine it in common.h.

Those were all eliminated, except in one catch-22 spot for
maybe.h: it can't include common.h. A fix might be to
make a fish_assert.h that *usually* common.h exports.
2022-08-20 23:55:18 -07:00
Fabian Boehm
7988cff6bd Increase the string chunk size to increase performance
This is a *tiny* commit code-wise, but the explanation is a bit
longer.

When I made string read in chunks, I picked a chunk size from bash's
read, under the assumption that they had picked a good one.

It turns out, on the (linux) systems I've tested, that's simply not
true.

My tests show that a bigger chunk size of up to 4096 is better *across
the board*:

- It's better with very large inputs
- It's equal-to-slightly-better with small inputs
- It's equal-to-slightly-better even if we quit early

My test setup:

0. Create various fish builds with various sizes for
STRING_CHUNK_SIZE, name them "fish-$CHUNKSIZE".
1. Download the npm package names from
https://github.com/nice-registry/all-the-package-names/blob/master/names.json (I
used commit 87451ea77562a0b1b32550124e3ab4a657bf166c, so it's 46.8MB)
2. Extract the names so we get a line-based version:

```fish
jq '.[]' names.json | string trim -c '"' >/tmp/all
```

3. Create various sizes of random extracts:

```fish
for f in 10000 1000 500 50
    shuf /tmp/all | head -n $f > /tmp/$f
end
```

(the idea here is to defeat any form of pattern in the input).

4. Run benchmarks:

hyperfine -w 3 ./fish-{128,512,1024,2048,4096}"
    -c 'for i in (seq 1000)
            string match -re foot < $f
        end; true'"

(reduce the seq size for the larger files so you don't have to wait
for hours - the idea here is to have some time running string and not
just fish startup time)

This shows results pretty much like

```
Summary
'./fish-2048     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true'' ran
  1.01 ± 0.02 times faster than './fish-4096     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true''
  1.02 ± 0.03 times faster than './fish-1024     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true''
  1.08 ± 0.03 times faster than './fish-512     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true''
  1.47 ± 0.07 times faster than './fish-128     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true''
```

So we see that up to 1024 there's a difference, and after that the
returns are marginal. So we stick with 1024 because of the memory
trade-off.

----

Fun extra:

Comparisons with `grep` (GNU grep 3.7) are *weird*. Because you both
get

```
'./fish-4096 -c 'for i in (seq 100); string match -re foot < /tmp/500; end; true'' ran
11.65 ± 0.23 times faster than 'fish -c 'for i in (seq 100); command grep foot /tmp/500; end''
```

and

```
'fish -c 'for i in (seq 2); command grep foot /tmp/all; end'' ran
66.34 ± 3.00 times faster than './fish-4096 -c 'for i in (seq 2);
string match -re foot < /tmp/all; end; true''
100.05 ± 4.31 times faster than './fish-128 -c 'for i in (seq 2);
string match -re foot < /tmp/all; end; true''
```

Basically, if you *can* give grep a lot of work at once (~40MB in this
case), it'll churn through it like butter. But if you have to call it
a lot, string beats it by virtue of cheating.
2022-08-15 20:16:12 +02:00
Aaron Gyes
2b2f772790 clarify "…variable is shadowed by the global variable of the same name"
Rephrase this to more explicitly indicate that the uvar actually
was successfully set. I believe the prior phrasing can leave some
ambiguity as far as wether set just failed with an error, whether it
has done anything or not.
2022-08-14 16:16:38 -07:00
Aaron Gyes
aacc71e585 builtin set: make error messages more consistent.
Now uses the same macro other builtins use for a missing -e arg,
and the error message show the short or long option as it was used.

e.g. before
    $ set -e
    set: Erase needs a variable name

after
    $ set --erase
    set: --erase: option requires an argument
    $ set -e
    set: -e: option requires an argument
2022-08-14 15:34:58 -07:00
ridiculousfish
2a0e0d6721 Remove the intern'd strings component
Intern'd strings were intended to be "shared" to reduce memory usage but
this optimization doesn't carry its weight. Remove it. No functional
change expected.
2022-08-13 12:51:36 -07:00
ridiculousfish
082f074bb1 Switch filenames from intern'd strings to shared_ptr
We store filenames in function definitions to indicate where the
function comes from. Previously these were intern'd strings. Switch them
to a shared_ptr<wcstring>, intending to remove intern'd strings.
2022-08-13 12:51:36 -07:00
Johannes Altmanninger
3dfacf4b39 builtin printf: suppress warnings about unused variables
No functional change.
2022-08-13 21:11:54 +02:00
Fabian Boehm
5fe43accef Add special error for set -o 2022-08-12 21:28:11 +02:00
Fabian Boehm
37f7818bbb printf: Ignore any options
This was misguidedly "fixed" in
9e08609f85, which made printf error out
with any "-"-prefixed words as the first argument.

Note: This means currently `printf --help` doesn't print the help.
This also matches `echo`, and we currently don't have anything to make
a literal `--help` execute a builtin help except for keywords. Oh well.

Fixes #9132
2022-08-10 16:55:56 +02:00
Fabian Boehm
eac808a819
string repeat: Don't allocate repeated string all at once (#9124)
* string repeat: Don't allocate repeated string all at once

This used to allocate one string and fill it with the necessary
repetitions, which could be a very very large string.

Now, it instead uses one buffer and fills it to a chunk size,
and then writes that.

This fixes:

1. We no longer crash with too large max/count values. Before they
caused a bad_alloc because we tried to fill all RAM.
2. We no longer fill all RAM if given a big-but-not-too-big value. You
could've caused fish to eat *most* of your RAM here.
3. It can start writing almost immediately, instead of waiting
potentially minutes to start.

Performance is about the same to slightly faster overall.
2022-08-09 19:58:56 +02:00
Johannes Altmanninger
8729623cec Make ESCAPE_ALL the default and call its inverse ESCAPE_NO_PRINTABLES
ESCAPE_ALL is not really a helpful name. Also it's the most common flag.
Let's make it the default so we can remove this unhelpful name.

While at it, let's add a default value for the flags argument, which helps
most callers.

The absence of ESCAPE_ALL makes it only escape nonprintable characters
(with some exceptions). We use this for displaying strings in the completion
pager as well as for the human-readable output of "set", "set -S", "bind"
and "functions".

No functional change.
2022-07-27 11:24:35 +02:00
Johannes Altmanninger
e5d5391687 Remove useless escaping of variable names
When listing variables, "set" tries to escape variable names.
Since variable names cannot have special characters, this doesn't do anything.

The escaping is one of the few places that does not use ESCAPE_ALL.  This has
complex behavior; let's alleviate the problem by getting rid of this call.

No functional change.
2022-07-27 11:24:35 +02:00
Johannes Altmanninger
3f90efca38 clang-format C++ files
Or should we stop using it?

I'm fine with either always or never using auto-formatting but our current
way of using it only sometimes is confusing.

No functional change.
2022-07-27 10:05:41 +02:00
Fabian Boehm
122b6c1734 status: Only realpath if we got an absolute path
Otherwise realpath would add the cwd, which would be broken if fish
ever cd'd.

We could add the original cwd, but even that isn't enough, because we
need *the parent's* idea of cwd and $PATH.

Or, alternatively, what we need is for the OS to give us the actual
path to ourselves.
2022-07-24 14:31:15 +02:00
Fabian Boehm
d241f0853e status: Do add the command name to the error 2022-07-24 13:17:06 +02:00
Fabian Boehm
4f1c62ff43 status: Realpath the executable path
get_executable_path says: "This needs to be realpath'd"

So how about we do that? The only other place we use it is fish.cpp,
and we realpath it there already.

See #9085
2022-07-24 12:36:32 +02:00
Fabian Boehm
407a455cfd realpath: Use physical PWD
This was an inadvertent change from
cc632d6ae9.

Because we used wgetcwd directly before, we always got the "physical"
resolved $PWD.

There's an argument to be made to use the logical $PWD here as well
but I prefer not to make changes lik that in a random commit without
good reason.
2022-07-18 20:45:30 +02:00
Fabian Boehm
5dfb64b547
Add path mtime (#9057)
This can be used to print the modification time, like `stat` with some
options.

The reason is that `stat` has caused us a number of portability
headaches:

1. It's not available everywhere by default
2. The versions are quite different

For instance, with GNU stat it's `stat -c '%Y'`, with macOS it's `stat
-f %m`.

So now checking a cache file can be done just with builtins.
2022-07-18 20:39:01 +02:00
Aaron Gyes
8f91ee7f6b builtin test: Implement -ot, -nt, -ef
These are non-POSIX extensions other test(1) utilities implement,
which compares the modification time of two files as proposed for
fish in #3589: testing if one file is newer than another file.

-ef is a common extension to test(1) which checks if two paths refer
to the same file, by comparing the dev and inode numbers.
2022-07-16 12:40:36 -07:00
Fabian Boehm
cc632d6ae9 realpath: Use the parser's working dir
Future proofing, similar to what we do in `path resolve`.
2022-07-12 20:53:57 +02:00
Aaron Gyes
cbd0ec568c builtins/path.cpp: remove <glob.h>
I don't believe we use any system glob faciltiies.
2022-07-09 21:11:43 -07:00
Aaron Gyes
3e0f3c9f45 path.cpp: include its actual header with the prototype
path.h: fix that header so it can compile.
2022-07-09 21:04:03 -07:00
ridiculousfish
f7c411d5a5 Further cleanup of builtin_string regex matching
Take advantage of additional cleanup unlocked by this refactoring,
including eliminating unneeded error returns and simplifying some
control flow.

No user-visible behavior change expected here.
2022-07-09 16:44:12 -07:00
ridiculousfish
d46f402cea Adopt the new re in builtin_string
This switches builtin_string from using PCRE2 directly, to using the new re
component. This simplifies some code and removes redundancy.

No user-visible behavior change expected here.
2022-07-09 16:41:15 -07:00
ridiculousfish
61b09ff4a7 Stop using a static unordered_map for string flag handlers
This switches the flag_to_function from a map to just an ordinary switch
statement. This saves some memory/startup time and removes some
relocations. No functional change here.
2022-07-04 13:40:55 -07:00
Fabian Boehm
98ba66ed8e set_color: Print the given colors with --print-colors 2022-07-01 21:28:35 +02:00
Fabian Boehm
bd7934ccbf history: Refuse to merge in private mode
It makes *no* sense.

Fixes #9050.
2022-07-01 20:10:18 +02:00
Fabian Boehm
dde2d33098 set --show: Show the originally inherited value, if any
This adds a line to `set --show`s output like

```
$PATH: originally inherited as |/home/alfa/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/var/lib/flatpak/exports/bin|
```

to help with debugging.

Note that this means keeping an additional copy of the original
environment around. At most this would be one ARG_MAX's worth, which
is about 2M.
2022-06-27 20:33:26 +02:00