Allow ** glob segments to match zero directories

Prior to this change, a glob like `**/file.txt` would only match
`file.txt` in subdirectories, because the `**` had to match at least one
directory. This was fish's historical behavior.

With this change we move a little closer to bash's implementation by
allowing a literal `**` segment to match in the current directory. That
is, `**/foo` will match both `foo` and `bar/foo`, while `b**/foo` will
only match `bar/foo`.
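
To make the rule concrete, here is a minimal C++ sketch of the matching
semantics described above. It is an illustration only, not the code in this
commit: `glob_match` and `check_examples` are hypothetical names, the sketch
matches path strings instead of walking directories, and it ignores fish
details such as hidden files and `?`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration; this is not fish's wildcard_expander_t.
// Returns true if path[si..] matches pat[pi..], where `*` matches any run of
// characters except '/', and `**` matches any run of characters including '/'.
static bool glob_match(const std::string &pat, const std::string &path,
                       std::size_t pi = 0, std::size_t si = 0) {
    if (pi == pat.size()) return si == path.size();
    if (pat[pi] != '*') {
        // Literal character: must match exactly.
        return si < path.size() && pat[pi] == path[si] &&
               glob_match(pat, path, pi + 1, si + 1);
    }
    const bool recursive = pi + 1 < pat.size() && pat[pi + 1] == '*';
    const std::size_t next = pi + (recursive ? 2 : 1);

    // The rule from this commit: a segment that is exactly "**" may match zero
    // directories, so also try the pattern tail at the same place in the path.
    const bool pat_seg_start = pi == 0 || pat[pi - 1] == '/';
    const bool path_seg_start = si == 0 || path[si - 1] == '/';
    if (recursive && pat_seg_start && path_seg_start && next < pat.size() &&
        pat[next] == '/' && glob_match(pat, path, next + 1, si)) {
        return true;
    }

    // Otherwise let the wildcard consume path[si..k) for each possible k.
    for (std::size_t k = si;; ++k) {
        if (glob_match(pat, path, next, k)) return true;
        if (k == path.size()) break;
        if (!recursive && path[k] == '/') break;  // a single * stops at '/'
    }
    return false;
}

// The cases named in the commit message, written as assertions.
static void check_examples() {
    assert(glob_match("**/foo", "foo"));       // zero directories
    assert(glob_match("**/foo", "bar/foo"));   // one directory
    assert(!glob_match("b**/foo", "foo"));     // segment is not exactly **
    assert(glob_match("b**/foo", "bar/foo"));
}
```

Calling `check_examples()` from a `main` should pass silently. The sketch
special-cases only a segment that is exactly `**`, which is why `b**/foo`
still requires a directory.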

Fixes #7222.
ridiculousfish 2020-12-28 18:11:47 -08:00
parent 6c08141682
commit 43505f7077
4 changed files with 28 additions and 5 deletions


@@ -45,6 +45,7 @@ Syntax changes and new commands
-------------------------------
- Range limits in index range expansions like ``$x[$start..$end]`` may be omitted: ``$start`` and ``$end`` default to 1 and -1 (the last item) respectively.
- Logical operators ``&&`` and ``||`` can be followed by newlines before their right operand, matching POSIX shells.
- When globbing, a segment which is exactly ``**`` may now match zero directories. For example ``**/foo`` may match ``foo`` in the current directory (:issue:`7222`).
Scripting improvements
----------------------


@@ -498,9 +498,9 @@ Wildcards ("Globbing")
When a parameter includes an :ref:`unquoted <quotes>` ``*`` star (or "asterisk") or a ``?`` question mark, fish uses it as a wildcard to match files.
-- ``*`` can match any string of characters not containing ``/``. This includes matching an empty string.
+- ``*`` matches any number of characters (including zero) in a file name, not including ``/``.
-- ``**`` matches any string of characters. This includes matching an empty string. The matched string can include the ``/`` character; that is, it goes into subdirectories. If a wildcard string with ``**`` contains a ``/``, that ``/`` still needs to be matched. For example, ``**\/*.fish`` won't match ``.fish`` files directly in the PWD, only in subdirectories. In fish you should type ``**.fish`` to match files in the PWD as well as subdirectories. [#]_
+- ``**`` matches any number of characters (including zero), and also descends into subdirectories. If ``**`` is a segment by itself, that segment may match zero times, for compatibility with other shells.
- ``?`` can match any single character except ``/``. This is deprecated and can be disabled via the ``qmark-noglob`` :ref:`feature flag<featureflags>`, so ``?`` will just be an ordinary character.
@@ -541,7 +541,6 @@ Examples::
end
# Lists the .foo files, if any.
-.. [#] Unlike other shells, notably zsh.
.. [#] Technically, unix allows filenames with newlines, and this splits the ``find`` output on newlines. If you want to avoid that, use find's ``-print0`` option and :ref:`string split0<cmd-string-split0>`.
.. _expand-command-substitution:


@@ -930,6 +930,20 @@ void wildcard_expander_t::expand(const wcstring &base_dir, const wchar_t *wc,
}
} else {
assert(!wc_segment.empty() && (segment_has_wildcards || is_last_segment));
if (!is_last_segment && wc_segment == wcstring{ANY_STRING_RECURSIVE}) {
// Hack for #7222. This is an intermediate wc segment that is exactly **. The
// tail matches in subdirectories as normal, but also the current directory.
// That is, '**/bar' may match 'bar' and 'foo/bar'.
// Implement this by matching the wildcard tail only, in this directory.
// Note if the segment is not exactly ANY_STRING_RECURSIVE then the segment may only
// match subdirectories.
this->expand(base_dir, wc_remainder, effective_prefix);
if (interrupted_or_overflowed()) {
return;
}
}
DIR *dir = open_dir(base_dir);
if (dir) {
if (is_last_segment) {
@@ -942,10 +956,9 @@ void wildcard_expander_t::expand(const wcstring &base_dir, const wchar_t *wc,
effective_prefix + wc_segment + L'/');
}
-// If we have a recursive wildcard in this segment, we want to recurse into
-// subdirectories.
size_t asr_idx = wc_segment.find(ANY_STRING_RECURSIVE);
if (asr_idx != wcstring::npos) {
+// Apply the recursive **.
// Construct a "head + any" wildcard for matching stuff in this directory, and an
// "any + tail" wildcard for matching stuff in subdirectories. Note that the
// ANY_STRING_RECURSIVE character is present in both the head and the tail.


@@ -74,6 +74,16 @@ string join \n **a2/** | sort
# CHECK: dir_a1/dir_a2/dir_a3
# CHECK: dir_a1/dir_a2/dir_a3/file_a
rm -Rf *
# Special behavior for #7222.
# The literal segment ** matches in the same directory.
mkdir foo
touch bar foo/bar
string join \n **/bar | sort
# CHECK: bar
# CHECK: foo/bar
# Clean up.
cd $HOME
rm -Rf $tmpdir