descend_unique_hierarchy is used for the cd autosuggestion: if a directory
contains exactly one subdirectory and no other entries, then propose that
as part of the cd autosuggestion.
This had a bug: if the subdirectory is a symlink to the parent, we would
chase that, going around the loop suggesting a longer path until we hit
PATH_MAX.
Fix this by using the new API which provides the inode "for free," and
track whether we've seen this inode before. This is technically too
conservative since the inode may be for a directory on a different device,
but devices are not available for free so this would incur a cost. In
practice encountering the same inode twice with different devices in a
unique hierarchy is unlikely, and should it happen the consequences are
merely cosmetic: we fail to suggest a longer path.
This introduces dir_iter_t, a new class for iterating the contents of a
directory. dir_iter_t encapsulates the logic that tries to avoid using
stat() to determine the type of a file, when possible.
This uses wreaddir_resolving, which tries to use the dirent d_type
field if it exists. In that way, it can skip the `stat` to determine
if the given file is a directory.
This allows `cd` completions to skip stat in most cases:
```fish
strace -Ce newfstatat fish --no-config -c 'complete -C"cd /tmp/completion_test/"' >/dev/null
```
prints before:
```
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100,00 0,002627 2 1033 4 newfstatat
```
after:
```
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100,00 0,000054 1 31 3 newfstatat
```
for a directory with 1000 subdirectories.
(just `fish --no-config -c exit` does 26 newfstatat)
This should improve the situation with slow filesystems like fuse or
network fsen.
In case we have no d_type, we use `stat`, which would yield about the
same results.
The worst case is that we need directories *and* descriptions or the
"executable" flag (which we don't currently check for cd, if I read
this right?).
Let's hope this doesn't causes build failures for e.g. musl: I just
know it's good on macOS and our Linux CI.
It's been a long time.
One fix this brings, is I discovered we #include assert.h or cassert
in a lot of places. If those ever happen to be in a file that doesn't
include common.h, or we are before common.h gets included, we're
unawaringly working with the system 'assert' macro again, which
may get disabled for debug builds or at least has different
behavior on crash. We undef 'assert' and redefine it in common.h.
Those were all eliminated, except in one catch-22 spot for
maybe.h: it can't include common.h. A fix might be to
make a fish_assert.h that *usually* common.h exports.
What this did was
1. Find directory
2. Turn name into wcstring and return it
3. Turn name back into string for some operations
Instead, let's unglue the wcstringing from this, return the narrow
string and then widen it when we need.
This allows us to skip re-wcs2stringing the base_dir again and again
by simply using the fd. It's about 10% faster in my testing.
fstatat is defined by POSIX, so it should be available everywhere.
The "linear" wildcard_match actually contained a bug that compared two
strings on every iteration, causing this to be much slower than
necessary. Fix this.
Prior to this change, a glob like `**/file.txt` would only match
`file.txt` in subdirectories; the `**` must match at least one directory.
This is historical behavior.
With this change we move a little closer to bash's implementation by
allowing a literal `**` segment to match in the current directory. That
is, `**/foo` will match both `foo` and `bar/foo`, while `b**/foo` will
only match `bar/foo`.
Fixes#7222.
Historically fish has not supported tab completing or autosuggesting
wildcards with **. Prior to this fix, we would test every file match,
discover the ** wildcard, and then ignore it. Instead look for **
wildcards at the top level.
This prevents autosuggesting with /** from chewing up your disk.
When globbing, we have a base directory (typically $PWD) and a path
component relative to that. As PWD is "virtual" it may be a symlink. Prior
to this change we would use wrealpath to resolve symlinks before opening
the directory during a glob, but this call to wrealpath consumed roughly
half of the time during globbing, and is conceptually unnecessary as
opendir will resolve symlinks for us.
Remove it. This may have funny effects if the user's PWD is an unlinked
directory, but it roughly doubles the speed of a glob like `echo ~/**`.
This adds the ability to limit how many expansions are produced. For
example if $big contains 10 items, and is Cartesian-expanded as
$big$big$big$big... 10 times, we would naviely get 10^10 = 10 billion
results, which fish can't actually handle. Implement this in
completion_receiver_t, which now can return false to indicate an overflow.
The initial expansion limit 'k_default_expansion_limit' is set as 512k
items. There's no way for users to change this at present.
This switches certain uses from just appending to a list to using
completion_receiver_t, in preparation for limiting how many completions
may be produced. Perhaps in time this could also be used for "streaming"
completions.
In preparation for introducing "smart case", refactor string fuzzy
matching. Specifically split out the case folding and match type into
separate fields, so that we can introduce more case folding types without
a combinatoric explosion.
This is used to decide which fuzzy match is better, however it is used
only in wildcard expansion and not in actual completion ranking or
anywhere else where it could matter. Try removing the compare() call
and implementation.
What compare() did specially was compare distances, e.g. it ranks
lib as better than libexec when expanding /u/l/b. But the tests did not
exercise this so it's hard to know if it's working. In preparation for a
refactoring, remove it.
When expanding a string, you may or may not want to generate
descriptions alongside the expanded string. Usually you don't want to
but descriptions were opt out. This commit makes them opt in.
This was originally comparing two pointers for equality but after the
refactor to wcstring it ended up comparing a const string pointer to the
_contents_ of the wcstring.
This commit recognizes an existing pattern: many operations need some
combination of a set of variables, a way to detect cancellation, and
sometimes a parser. For example, tab completion needs a parser to execute
custom completions, the variable set, should cancel on SIGINT. Background
autosuggestions don't need a parser, but they do need the variables and
should cancel if the user types something new. Etc.
This introduces a new triple operation_context_t that wraps these concepts
up. This simplifies many method signatures and argument passing.