regex -- evaluate a regular expression search

Description

The value returned is a list of pairs of integers: the first pair describes the entire match, and each later pair describes a parenthesized subexpression that was successfully matched. Each pair is suitable for use as the first argument of substring. The first member of each pair is the offset within str of the substring matched, and the second is its length.

See regular expressions for a brief introduction to the topic.

i1 : s = "The cat is black.";
i2 : m = regex("(\\w+) (\\w+) (\\w+)",s)

o2 = {(0, 10), (0, 3), (4, 3), (8, 2)}

o2 : List
i3 : substring(m#0, s)

o3 = The cat is
i4 : substring(m#1, s)

o4 = The
i5 : substring(m#2, s)

o5 = cat
i6 : substring(m#3, s)

o6 = is
i7 : s = "aa     aaaa";
i8 : m = regex("a+", 0, s)

o8 = {(0, 2)}

o8 : List
i9 : substring(m#0, s)

o9 = aa
i10 : m = regex("a+", 2, s)

o10 = {(7, 4)}

o10 : List
i11 : substring(m#0, s)

o11 = aaaa
i12 : m = regex("a+", 2, 3, s)
i13 : s = "line 1\nline 2\r\nline 3";
i14 : m = regex("^.*$", 8, -8, s)

o14 = {(7, 6)}

o14 : List
i15 : substring(m#0, s)

o15 = line 2
i16 : m = regex("^", 10, -10, s)

o16 = {(7, 0)}

o16 : List
i17 : substring(0, m#0#0, s)

o17 = line 1
i18 : substring(m#0#0, s)

o18 = line 2
      line 3
i19 : m = regex("^.*$", 4, -10, s)

o19 = {(0, 6)}

o19 : List
i20 : substring(m#0, s)

o20 = line 1
i21 : m = regex("a.*$", 4, -10, s)
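
Because each pair can be passed directly to substring, every matched piece can be collected in one step, and a failed search can be detected before indexing into the result. The following is a minimal sketch: the helper name matchedStrings is hypothetical and not part of Macaulay2, and it assumes that regex returns null when nothing matches, as the searches without output above suggest.

```macaulay2
-- matchedStrings is a hypothetical helper, not a built-in function.
-- It returns the whole match followed by each parenthesized group,
-- or an empty list when regex finds no match (assumed to return null).
matchedStrings = (re, str) -> (
    m := regex(re, str);
    if m === null then {} else apply(m, p -> substring(p, str))
    )

matchedStrings("(\\w+) (\\w+) (\\w+)", "The cat is black.")
-- value: {"The cat is", "The", "cat", "is"}
matchedStrings("dog", "The cat is black.")
-- value: {}
```

Checking for null first avoids an error from expressions such as m#0 when the pattern does not occur in the string.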

By default, regular expressions are interpreted using the Perl flavor, which supports features such as lookaheads and lookbehinds for fine-tuning matches. This syntax is used in the Perl and JavaScript languages.

i22 : regex("A(?!C)", "AC AB")

o22 = {(3, 1)}

o22 : List
i23 : regex("A(?=B)", "AC AB")

o23 = {(3, 1)}

o23 : List
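
The two examples above use lookaheads. Lookbehinds work the same way but constrain the text before the match. The sketch below assumes the usual Perl syntax (?<=...) and (?<!...) for positive and negative lookbehind; the test strings are chosen only for illustration.

```macaulay2
-- Positive lookbehind: match B only when it is preceded by A.
regex("(?<=A)B", "AC AB")   -- {(4, 1)}
-- Negative lookbehind: match B only when it is not preceded by A.
regex("(?<!A)B", "AB CB")   -- {(4, 1)}
```

In both cases the lookbehind text is not part of the match, so the reported length is 1.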

Alternatively, one can choose the POSIX Extended flavor of regex using POSIX => true. This syntax is similar to the one used by the Unix utilities egrep and awk and enforces the leftmost, longest rule for finding matches. If there's a tie, the rule is applied to the first subexpression.

i24 : s = "<b>bold</b> and <b>strong</b>";
i25 : m = regex("<b>(.*)</b>", s, POSIX => true);
i26 : substring(m#1, s)

o26 = bold</b> and <b>strong
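
The leftmost, longest rule is easiest to see with alternation, where the two flavors can disagree. As a small sketch, assuming the Perl flavor takes the first alternative that succeeds while the POSIX Extended flavor prefers the longest match at the leftmost position:

```macaulay2
-- Perl flavor: the first alternative "a" succeeds, so the match has length 1.
regex("a|ab", "ab")                 -- {(0, 1)}
-- POSIX Extended flavor: the longest match at offset 0 is "ab".
regex("a|ab", "ab", POSIX => true)  -- {(0, 2)}
```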

In the Perl flavor, one can specify whether repetitions should be possessive or non-greedy.

i27 : m = regex("<b>(.*?)</b>", s);
i28 : substring(m#1, s)

o28 = bold
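
The three repetition modes can be compared on one short string. This is a sketch that assumes the standard Perl quantifier suffixes: no suffix for greedy, ? for non-greedy, and + for possessive.

```macaulay2
-- Greedy: a* first takes every a, then backtracks to leave one for the final a.
regex("a*a", "aaa")    -- {(0, 3)}
-- Non-greedy: a*? matches as little as possible, so only the final a is consumed.
regex("a*?a", "aaa")   -- {(0, 1)}
-- Possessive: a*+ takes every a and never backtracks, so the final a cannot match.
regex("a*+a", "aaa")   -- no match (null)
```

A possessive repetition is mainly useful when backtracking could never produce a valid match, since it lets the search fail quickly.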

For the programmer

The object regex is a method function with options.