<?xml version="1.0" encoding="UTF-8"?>
<!-- File: regexps.xml, language: XML. Original from: Isabelle -->
<appendix id="regexps">
    <title>Regular Expressions</title>
    
    
    

    <para>jEdit uses regular expressions from <ulink
    url="https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Pattern.html">java.util.regex.Pattern</ulink>
    to implement inexact search and replace. Follow the link for a complete
    reference guide to all supported metacharacters.</para>

    <para>A regular expression consists of a string where some characters are
    given special meaning with regard to pattern matching.</para>

    <note>
        <title>Inside XML files</title>

        <para>Inside XML files (such as jEdit mode files), you must escape
        XML special characters such as <literal>&amp;</literal>,
        <literal>&lt;</literal> and <literal>&gt;</literal>. The XML plugin's
        <quote>characters to entities</quote> action can perform this
        mapping.</para>
    </note>

    <note>
        <title>Inside Java / BeanShell / properties files</title>

        <para>In these files, backslash escapes in a string are processed
        before the regular expression engine sees the string, so every
        backslash in a regular expression must be escaped with an extra
        backslash (<literal>\\</literal>).</para>
    </note>
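
    <para>As a minimal sketch of the escaping rule in the note above, the
    following Java class writes two regular expressions as string literals.
    The class and method names (<literal>BackslashEscapingDemo</literal>,
    <literal>isNumber</literal>, <literal>toForwardSlashes</literal>) are
    illustrative assumptions and are not part of jEdit.</para>

    <programlisting language="java"><![CDATA[
```java
import java.util.regex.Pattern;

final class BackslashEscapingDemo {
    // The regular expression \d+ is written "\\d+" in Java source,
    // because the string literal consumes one level of backslashes.
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // The regular expression for one literal backslash is \\ ,
    // so the Java string literal needs four backslashes.
    static final Pattern BACKSLASH = Pattern.compile("\\\\");

    // True if the whole string consists of one or more decimal digits.
    static boolean isNumber(String s) {
        return DIGITS.matcher(s).matches();
    }

    // Replaces every backslash in the path with a forward slash.
    static String toForwardSlashes(String path) {
        return BACKSLASH.matcher(path).replaceAll("/");
    }

    public static void main(String[] args) {
        System.out.println(isNumber("2024"));
        System.out.println(toForwardSlashes("C:\\temp\\notes.txt"));
    }
}
```
]]></programlisting>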

    <para>Within a regular expression, the following characters have special
    meaning:</para>

    <bridgehead>Positional Operators</bridgehead>

    <itemizedlist>
        <listitem>
            <para><literal>^</literal> matches at the beginning of a line</para>
        </listitem>

        <listitem>
            <para><literal>$</literal> matches at the end of a line</para>
        </listitem>

        <listitem>
            <para><literal>\b</literal> matches at a word boundary</para>
        </listitem>

        <listitem>
            <para><literal>\B</literal> matches at any position that is not a word boundary</para>
        </listitem>

        <listitem>
            <para><literal>\A</literal> matches at the beginning of the input</para>
        </listitem>

        <listitem>
            <para><literal>\G</literal> matches at the end of the previous match</para>
        </listitem>

        <listitem>
            <para><literal>\Z</literal> matches at the end of the input, but before the final line terminator, if any</para>
        </listitem>

        <listitem>
            <para><literal>\z</literal> matches at the end of the input</para>
        </listitem>
    </itemizedlist>
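
    <para>The positional operators can be tried out with a small sketch such
    as the one below. It assumes plain <literal>java.util.regex</literal>,
    where <literal>^</literal> and <literal>$</literal> anchor at line
    boundaries only when the <literal>MULTILINE</literal> flag is set; the
    class and method names are hypothetical.</para>

    <programlisting language="java"><![CDATA[
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchorDemo {
    // Counts the non-overlapping matches of a compiled pattern in the text.
    private static int count(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int matches = 0;
        while (matcher.find()) {
            matches++;
        }
        return matches;
    }

    // Default mode: ^ and $ anchor at the start and end of the whole input.
    static int countInWholeInput(String regex, String text) {
        return count(Pattern.compile(regex), text);
    }

    // MULTILINE mode: ^ and $ also anchor at each line terminator.
    static int countPerLine(String regex, String text) {
        return count(Pattern.compile(regex, Pattern.MULTILINE), text);
    }

    public static void main(String[] args) {
        String text = "cat concat\ncat";
        System.out.println(countInWholeInput("cat", text));       // 3
        System.out.println(countInWholeInput("\\bcat\\b", text)); // 2
        System.out.println(countInWholeInput("^cat", text));      // 1
        System.out.println(countPerLine("^cat", text));           // 2
        System.out.println(countPerLine("\\Acat", text));         // 1
    }
}
```
]]></programlisting>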

    <bridgehead>One-Character Operators</bridgehead>

    <itemizedlist>
        <listitem>
            <para><literal>.</literal> matches any single character; in <literal>java.util.regex</literal>, line terminators are matched only when the DOTALL flag (<literal>(?s)</literal>) is set</para>
        </listitem>

        <listitem>
            <para><literal>\d</literal> matches any decimal digit (<literal>[0-9]</literal>)</para>
        </listitem>

        <listitem>
            <para><literal>\D</literal> matches any non-digit (<literal>[^0-9]</literal>)</para>
        </listitem>

        <listitem>
            <para><literal>\n</literal> matches the newline character (<literal>\u000A</literal>)</para>
        </listitem>

        <listitem>
            <para><literal>\s</literal> matches any whitespace character (<literal>[ \t\n\x0B\f\r]</literal>)</para>
        </listitem>

        <listitem>
            <para><literal>\x<replaceable>hh</replaceable></literal> matches the character with hexadecimal code <literal>0x<replaceable>hh</replaceable></literal></para>
        </listitem>

        <listitem>
            <para><literal>\S</literal> matches any non-whitespace character (<literal>[^\s]</literal>)</para>
        </listitem>

        <listitem>
            <para><literal>\t</literal> matches a horizontal tab character (<literal>\u0009</literal>)</para>
        </listitem>

        <listitem>
            <para><literal>\w</literal> matches any word character (<literal>[a-zA-Z_0-9]</literal>)</para>
        </listitem>

        <listitem>
            <para><literal>\W</literal> matches any non-word character (<literal>[^\w]</literal>)</para>
        </listitem>

        <listitem>
            <para><literal>\\</literal> matches the backslash character (<quote>\</quote>)</para>
        </listitem>
    </itemizedlist>
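
    <para>As a hedged illustration of the one-character operators, the
    sketch below uses <literal>\s</literal>, <literal>\D</literal> and
    <literal>\w</literal> through the standard <literal>String</literal>
    methods. The helper names are assumptions made for this example.</para>

    <programlisting language="java"><![CDATA[
```java
final class ShorthandDemo {
    // Splits the text on runs of whitespace (\s+), ignoring outer whitespace.
    static String[] words(String text) {
        return text.trim().split("\\s+");
    }

    // Removes every character that is not a decimal digit (\D).
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // True if the whole string consists of word characters (\w) only.
    static boolean isIdentifier(String s) {
        return s.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(String.join("|", words("  a\tb \n c ")));
        System.out.println(digitsOnly("+49 (0)30-1234"));
        System.out.println(isIdentifier("foo_42"));
    }
}
```
]]></programlisting>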

    <bridgehead>Character Class Operator</bridgehead>

    <itemizedlist>
        <listitem>
            <para><literal>[<replaceable>abc</replaceable>]</literal> matches
            any character in the set <replaceable>a</replaceable>,
            <replaceable>b</replaceable> or <replaceable>c</replaceable>.
            A leading <quote>]</quote> will be interpreted literally.</para>
        </listitem>

        <listitem>
            <para><literal>[^<replaceable>abc</replaceable>]</literal> matches
            any character not in the set <replaceable>a</replaceable>,
            <replaceable>b</replaceable> or <replaceable>c</replaceable>.
            A leading <quote>]</quote> after the <quote>^</quote>
            will be interpreted literally.</para>
        </listitem>

        <listitem>
            <para><literal>[<replaceable>a-zA-Z</replaceable>]</literal> matches
            any character in the ranges <replaceable>a</replaceable> to
            <replaceable>z</replaceable> and <replaceable>A</replaceable> to
            <replaceable>Z</replaceable>, inclusive. A leading or trailing dash
            and a leading <quote>]</quote> will be interpreted literally.</para>
        </listitem>
    </itemizedlist>
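
    <para>The character class forms listed above could be exercised as
    follows. This is only a sketch with hypothetical method names; it shows a
    plain set, a negated set with ranges, and a leading dash taken
    literally.</para>

    <programlisting language="java"><![CDATA[
```java
final class CharClassDemo {
    // [aeiouAEIOU] matches any single character from the listed set.
    static String stripVowels(String s) {
        return s.replaceAll("[aeiouAEIOU]", "");
    }

    // [^a-zA-Z] matches any character outside the two ranges.
    static String keepLetters(String s) {
        return s.replaceAll("[^a-zA-Z]", "");
    }

    // In [-+] the leading dash is literal, so the class means "minus or plus".
    static boolean isSignedInteger(String s) {
        return s.matches("[-+]?[0-9]+");
    }

    public static void main(String[] args) {
        System.out.println(stripVowels("Regular Expressions"));
        System.out.println(keepLetters("a1-b2_C3"));
        System.out.println(isSignedInteger("-42"));
    }
}
```
]]></programlisting>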

    <bridgehead>Subexpressions and Backreferences</bridgehead>

    <itemizedlist>
        <listitem>
            <para><literal>(<replaceable>abc</replaceable>)</literal> matches
            whatever the expression <replaceable>abc</replaceable> would match,
            and saves it as a subexpression. Also used for grouping</para>
        </listitem>

        <listitem>
            <para><literal>(?&lt;<replaceable>name</replaceable>&gt;<replaceable>abc</replaceable>)</literal>
            matches whatever the expression <replaceable>abc</replaceable> would match,
            and saves it as a subexpression called <replaceable>name</replaceable>.
            Also used for grouping</para>
        </listitem>

        <listitem>
            <para><literal>(?:<replaceable>...</replaceable>)</literal> pure
            grouping operator, does not save contents</para>
        </listitem>

        <listitem>
            <para><literal>(?=<replaceable>...</replaceable>)</literal> positive
            lookahead; the regular expression will match only if the expression
            in the brackets matches the text that follows the current position,
            but that text will not be considered part of the match</para>
        </listitem>

        <listitem>
            <para><literal>(?!<replaceable>...</replaceable>)</literal> negative
            lookahead; the regular expression will match only if the expression
            in the brackets does not match the text that follows the current
            position, and that text will not be considered part of the
            match</para>
        </listitem>

        <listitem>
            <para><literal>(?&lt;=<replaceable>...</replaceable>)</literal> positive
            lookbehind; the regular expression will match only if the expression
            in the brackets matches the text immediately before the current
            position, but that text will not be considered part of the
            match</para>
        </listitem>

        <listitem>
            <para><literal>(?&lt;!<replaceable>...</replaceable>)</literal> negative
            lookbehind; the regular expression will match only if the expression
            in the brackets does not match the text immediately before the
            current position, and that text will not be considered part of the
            match</para>
        </listitem>

        <listitem>
            <para><literal>(?&gt;<replaceable>...</replaceable>)</literal> pure
            possessive (atomic) grouping operator; does not save contents and,
            once matched, does not give characters back during backtracking</para>
        </listitem>

        <listitem>
            <para><literal>\<replaceable>n</replaceable></literal> where 0 &lt;
            <replaceable>n</replaceable> &lt; 10, matches the same thing the
            <replaceable>n</replaceable>th subexpression matched. Can only be
            used in the search string</para>
        </listitem>

        <listitem>
            <para><literal>$<replaceable>n</replaceable></literal> where 0 &lt;
            <replaceable>n</replaceable> &lt; 10, is substituted with the text
            matched by the <replaceable>n</replaceable>th subexpression. Can
            only be used in the replacement string</para>
        </listitem>

        <listitem>
            <para><literal>\k&lt;<replaceable>name</replaceable>&gt;</literal>
            matches the same thing the subexpression called <replaceable>name</replaceable>
            matched. Can only be used in the search string</para>
        </listitem>

        <listitem>
            <para><literal>${<replaceable>name</replaceable>}</literal>
            is substituted with the text matched by the subexpression called <replaceable>name</replaceable>.
            Can only be used in the replacement string</para>
        </listitem>
    </itemizedlist>
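
    <para>The following sketch shows how numbered groups, named groups, a
    backreference and a lookahead from the list above might be combined in
    plain Java. All class and method names are hypothetical; the same search
    and replacement strings (without the doubled Java backslashes) are the
    kind of input the search dialog accepts.</para>

    <programlisting language="java"><![CDATA[
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    // Numbered groups: $1 and $2 in the replacement refer to the two captures.
    static String swapNames(String s) {
        return s.replaceAll("(\\w+), (\\w+)", "$2 $1");
    }

    // Named groups: ${name} in the replacement refers to (?<name>...).
    static String reformatDate(String s) {
        return s.replaceAll(
                "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})",
                "${day}.${month}.${year}");
    }

    // Backreference: \1 in the search string repeats what group 1 matched.
    static String collapseDoubledWords(String s) {
        return s.replaceAll("\\b(\\w+) \\1\\b", "$1");
    }

    // Lookahead: the digits must be followed by "px", which is not consumed.
    static String firstPixelValue(String s) {
        Matcher matcher = Pattern.compile("\\d+(?=px)").matcher(s);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(swapNames("Lovelace, Ada"));
        System.out.println(reformatDate("2024-03-09"));
        System.out.println(collapseDoubledWords("the the cat"));
        System.out.println(firstPixelValue("width: 120px; height: 80em"));
    }
}
```
]]></programlisting>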

    <bridgehead>Branching (Alternation) Operator</bridgehead>

    <itemizedlist>
        <listitem>
            <para><literal><replaceable>a</replaceable>|<replaceable>b</replaceable></literal>
            matches whatever the expression <replaceable>a</replaceable> would
            match, or whatever the expression <replaceable>b</replaceable> would
            match.</para>
        </listitem>
    </itemizedlist>
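
    <para>Because alternation binds more loosely than the positional
    operators, grouping is often needed around the alternatives. The sketch
    below, with assumed method names, contrasts the two spellings.</para>

    <programlisting language="java"><![CDATA[
```java
import java.util.regex.Pattern;

final class AlternationDemo {
    // Reads as "(^cat) or (dog$)": either branch alone is enough.
    static boolean looseMatch(String s) {
        return Pattern.compile("^cat|dog$").matcher(s).find();
    }

    // The non-capturing group applies both anchors to both alternatives.
    static boolean exactMatch(String s) {
        return Pattern.compile("^(?:cat|dog)$").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(looseMatch("catalog")); // true
        System.out.println(exactMatch("catalog")); // false
        System.out.println(exactMatch("dog"));     // true
    }
}
```
]]></programlisting>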

    <bridgehead>Repeating Operators</bridgehead>

    <para>These symbols operate on the previous atomic expression.</para>

    <itemizedlist>
        <listitem>
            <para><literal>?</literal> matches the preceding expression once or not at all</para>
        </listitem>

        <listitem>
            <para><literal>*</literal> matches the preceding expression zero or more times</para>
        </listitem>

        <listitem>
            <para><literal>+</literal> matches the preceding expression one or more times</para>
        </listitem>

        <listitem>
            <para><literal>{<replaceable>n</replaceable>}</literal> matches the preceding expression
            exactly <replaceable>n</replaceable> times</para>
        </listitem>

        <listitem>
            <para><literal>{<replaceable>n</replaceable>,<replaceable>m</replaceable>}</literal>
            matches the preceding expression between <replaceable>n</replaceable> and
            <replaceable>m</replaceable> times, inclusive</para>
        </listitem>

        <listitem>
            <para><literal>{<replaceable>n</replaceable>,}</literal> matches
            the preceding expression <replaceable>n</replaceable> or more times</para>
        </listitem>
    </itemizedlist>
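
    <para>A few of the repeating operators in use, as a minimal sketch with
    hypothetical method names:</para>

    <programlisting language="java"><![CDATA[
```java
final class QuantifierDemo {
    // {6}: exactly six hexadecimal digits after the hash sign.
    static boolean isHexColor(String s) {
        return s.matches("#[0-9a-fA-F]{6}");
    }

    // ?: the letter u may appear once or not at all.
    static boolean isColorWord(String s) {
        return s.matches("colou?r");
    }

    // {2,4}: between two and four digits, inclusive.
    static boolean isShortNumber(String s) {
        return s.matches("\\d{2,4}");
    }

    public static void main(String[] args) {
        System.out.println(isHexColor("#1a2B3c"));
        System.out.println(isColorWord("colour"));
        System.out.println(isShortNumber("2024"));
    }
}
```
]]></programlisting>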

    <bridgehead>Greedy, Reluctant and Possessive Matching</bridgehead>

    <para>If a repeating operator (above) is immediately followed by a
    <literal>?</literal>, it behaves reluctantly, that is,
    the repeating operator will stop at the smallest
    number of repetitions that allows the rest of the match to complete.</para>

    <para>If a repeating operator (above) is immediately followed by a
    <literal>+</literal>, it behaves possessively, that is,
    the repeating operator will match as many characters as it can
    and will not back off during backtracking,
    even if that would allow the rest of the match to complete.</para>

    <para>If a repeating operator (above) is not immediately followed by a
    <literal>?</literal> or <literal>+</literal>, it behaves greedily, that is,
    the repeating operator will match as many characters as it can,
    but it will back off character by character during backtracking
    if that would allow the rest of the match to complete.</para>
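
    <para>The three behaviours can be compared on one input. The sketch below
    is an illustration only; <literal>GreedinessDemo</literal> and
    <literal>firstMatch</literal> are assumed names. It applies the same
    pattern with a greedy, a reluctant and a possessive
    <literal>*</literal>.</para>

    <programlisting language="java"><![CDATA[
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedinessDemo {
    // Returns the first match of the regex in the text, or null if none.
    static String firstMatch(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String text = "\"one\" and \"two\"";
        // Greedy: runs to the end, then backs off to the last quote.
        System.out.println(firstMatch("\".*\"", text));  // "one" and "two"
        // Reluctant: stops at the first closing quote.
        System.out.println(firstMatch("\".*?\"", text)); // "one"
        // Possessive: consumes the last quote too and never gives it back.
        System.out.println(firstMatch("\".*+\"", text)); // null
    }
}
```
]]></programlisting>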

    <note>
        <title>On regex search</title>

        <para>The <literal>java.util.regex</literal> library has some known
        issues. In particular, it is possible to write
        regular expressions that hang the JVM or cause stack overflow
        errors, which was harder to do with the legacy
        <literal>gnu.regexp</literal> library. If you find that
        <literal>gnu.regexp</literal>, used in jEdit 4.2 and earlier, is
        more suitable for your search and replace needs, you can try the
        <emphasis role="bold">XSearch plugin</emphasis>, which still
        uses it and can replace the built-in search
        dialog.</para>
    </note>

</appendix>
