string-match - match substrings
===============================

Synopsis
--------

.. BEGIN SYNOPSIS

.. synopsis::

    string match [-a | --all] [-e | --entire] [-i | --ignore-case]
                 [-g | --groups-only] [-r | --regex] [-n | --index]
                 [-q | --quiet] [-v | --invert]
                 PATTERN [STRING ...]

.. END SYNOPSIS

Description
-----------

.. BEGIN DESCRIPTION

``string match`` tests each *STRING* against *PATTERN* and prints matching substrings. Only the first match for each *STRING* is reported unless **-a** or **--all** is given, in which case all matches are reported.

If **-e** or **--entire** is given, each matching *STRING* is printed in full, including any prefix or suffix not matched by the pattern (equivalent to ``grep`` without the **-o** flag). The same result can be achieved by prepending and appending ``*`` (for a glob) or ``.*`` (with **--regex**) to the pattern; **--entire** simply avoids complicating the pattern in that fashion and makes the intent of the ``string match`` clearer. Without **--entire** and **--regex**, a *PATTERN* must match the entire *STRING* before it is reported.
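
As a small sketch of **--entire** combined with **--regex** (the input strings are made up for illustration), the whole matching argument is printed rather than just the matched part::

    >_ string match -re 'dog' 'nice dog' 'fat cat'
    nice dog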

Matching can be made case-insensitive with **--ignore-case** or **-i**.

If **--groups-only** or **-g** is given, only the capturing groups will be reported - meaning the full match will be skipped. This is incompatible with **--entire** and **--invert**, and requires **--regex**. It is useful as a simple cutting tool instead of ``string replace``, so you can simply choose "this part" of a string.
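
As an illustration of the "cutting tool" use, the following sketch splits a made-up ``KEY=value`` string into its two parts; only the capturing groups are printed, not the full match::

    >_ string match -rg '^([^=]+)=(.*)$' 'EDITOR=vim'
    EDITOR
    vim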

If **--index** or **-n** is given, each match is reported as a 1-based start position and a length.

By default, *PATTERN* is interpreted as a glob pattern, which is only considered a match if it matches the entire *STRING* argument.
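
For instance, **--index** can be used with either kind of pattern. In this sketch (inputs chosen only for illustration), the glob has to cover the whole string, so the match starts at position 1, while the regular expression reports where the substring begins::

    >_ string match -n 'a*b' axxb
    1 4

    >_ string match -rn 'dog' 'nice dog'
    6 3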

If **--regex** or **-r** is given, *PATTERN* is interpreted as a Perl-compatible regular expression, which does not have to match the entire *STRING*. For a regular expression containing capturing groups, multiple items will be reported for each match, one for the entire match and one for each capturing group. With this, only the matching part of the *STRING* will be reported, unless **--entire** is given.

When matching via regular expressions, ``string match`` automatically sets variables for all named capturing groups (``(?<name>expression)``). It will create a variable with the name of the group, in the default scope, for each named capturing group, and set it to the value of the capturing group in the first matched argument. If a named capture group matched an empty string, the variable will be set to the empty string (like ``set var ""``). If it did not match, the variable will be set to nothing (like ``set var``).

When **--regex** is used with **--all**, this behavior changes. Each named variable will contain a list of matches, with the first match contained in the first element, the second match in the second, and so on. If the group was empty or did not match, the corresponding element will be an empty string.

If **--invert** or **-v** is used, only the *STRING* arguments that do not match the given glob pattern or regular expression are reported.
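
A minimal sketch of **--invert** with a glob pattern (the arguments are arbitrary): only the argument that does not match is printed::

    >_ string match -v 'foo*' foo1 bar foo2
    bar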

Exit status: 0 if at least one match was found, or 1 otherwise.
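
The exit status makes ``string match`` usable as a condition. The following sketch assumes that **--quiet** (listed in the synopsis) suppresses the normal output so that only the status is used::

    >_ string match -q 'a*b' axxb; and echo matched
    matched

    >_ string match -q 'a*b' xyz; or echo 'no match'
    no match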

.. END DESCRIPTION

Examples
--------

.. BEGIN EXAMPLES

Match Glob Examples
^^^^^^^^^^^^^^^^^^^

::

    >_ string match '?' a
    a

    >_ string match 'a*b' axxb
    axxb

    >_ string match -i 'a??B' Axxb
    Axxb

    >_ echo 'ok?' | string match '*\?'
    ok?

    # Note that only the second STRING will match here.
    >_ string match 'foo' 'foo1' 'foo' 'foo2'
    foo

    >_ string match -e 'foo' 'foo1' 'foo' 'foo2'
    foo1
    foo
    foo2

    >_ string match 'foo?' 'foo1' 'foo' 'foo2'
    foo1
    foo2

Match Regex Examples
^^^^^^^^^^^^^^^^^^^^

::

    >_ string match -r 'cat|dog|fish' 'nice dog'
    dog

    >_ string match -r -v "c.*[12]" {cat,dog}(seq 1 4)
    dog1
    dog2
    cat3
    dog3
    cat4
    dog4

    >_ string match -r '(\d\d?):(\d\d):(\d\d)' 2:34:56
    2:34:56
    2
    34
    56

    >_ string match -r '^(\w{2,4})\1$' papa mud murmur
    papa
    pa
    murmur
    mur

    >_ string match -r -a -n at ratatat
    2 2
    4 2
    6 2

    >_ string match -r -i '0x[0-9a-f]{1,8}' 'int magic = 0xBadC0de;'
    0xBadC0de

    >_ echo $version
    3.1.2-1575-ga2ff32d90
    >_ string match -rq '(?<major>\d+)\.(?<minor>\d+)\.(?<revision>\d+)' -- $version
    >_ echo "You are using fish $major!"
    You are using fish 3!

    >_ string match -raq ' *(?<sentence>[^.!?]+)(?<punctuation>[.!?])?' "hello, friend. goodbye"
    >_ printf "%s\n" -- $sentence
    hello, friend
    goodbye
    >_ printf "%s\n" -- $punctuation
    .

    >_ string match -rq '(?<word>hello)' 'hi'
    >_ count $word
    0

.. END EXAMPLES