Difference between revisions of "Regular Expressions"

Revision as of 18:58, 22 May 2017

A regular expression is a notation for defining all the valid strings of a formal language.

Regular Expression	Meaning
a	Matches a string consisting of just the symbol a
b	Matches a string consisting of just the symbol b
ab	Matches a string consisting of the symbol a followed by the symbol b
a*	Matches a string consisting of zero or more a’s
a+	Matches a string consisting of one or more a’s
abb?	Matches the string ab or the string abb. The ? symbol indicates zero or one of the preceding element
a|b	Matches a string consisting of the symbol a or the symbol b
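
The rows above can be checked mechanically. The following is a minimal sketch, assuming Python's built-in re module (the page itself does not name a language); the helper name matches is made up for illustration. It uses re.fullmatch so that the whole string, not just part of it, has to fit the expression.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True only when the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


# Each expression from the table, paired with strings to try against it.
trials = {
    "a*": ["", "a", "aaa", "b"],
    "a+": ["", "a", "aaa"],
    "abb?": ["ab", "abb", "abbb"],
    "a|b": ["a", "b", "ab"],
}

for pattern, strings in trials.items():
    for s in strings:
        print(f"{pattern!r} against {s!r}: {matches(pattern, s)}")
```

Note that a* accepts the empty string while a+ does not, which is the only difference between the two operators.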

When using regular expressions, the rules of precedence are as follows:

*, + and ? are applied first

Concatenation (i.e. joining elements together) is applied next

| is applied last
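
The effect of these precedence rules can be seen by comparing an expression with and without brackets. This is a sketch only, again assuming Python's re module; the helper name accepts is hypothetical.

```python
import re


def accepts(pattern: str, text: str) -> bool:
    """Return True only when the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


# * binds more tightly than concatenation, so ab* means a(b*), not (ab)*.
print(accepts("ab*", "abbb"))    # a followed by several b's
print(accepts("ab*", "abab"))    # rejected: the * applies to b only
print(accepts("(ab)*", "abab"))  # brackets make the * apply to ab

# Concatenation binds more tightly than |, so ab|c means (ab)|c, not a(b|c).
print(accepts("ab|c", "c"))      # the right-hand alternative on its own
print(accepts("ab|c", "ac"))     # rejected: a is not joined to (b|c)
print(accepts("a(b|c)", "ac"))   # brackets change the grouping
```

Brackets are the only way to override the default order, in the same way as in arithmetic.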

Examples of regular expressions using the alphabet {a, b, c}

Symbol	Meaning	Example
|	Used to separate alternatives	a|b (means a or b)
?	Used to denote zero or one of the preceding element	a? (0 or 1 a’s; matches ‘’ and ‘a’)
*	Used to denote zero or more of the preceding element	a* (0 or more a’s; matches ‘’, ‘a’, ‘aa’, etc.)
+	Used to denote one or more of the preceding element	a+ (1 or more a’s; matches ‘a’, ‘aa’, etc.)
( )	Used to group characters together, to indicate the scope of another operator	(ab)* (0 or more ab’s; matches ‘’, ‘ab’, ‘abab’, etc.)
[ ]	Another way of denoting alternatives (instead of vertical bar). Defines a character class	[ab] (means a or b)
\	The escape character (this turns a metacharacter into an ordinary character)	a\* (the a character followed by the * character. Note: \ is needed because a* would mean zero or more a’s.)
^	Used to indicate the negation of a character class. Also used to match the position before the first character in a string	a[^bc] (a followed by a character that is not b or c); ^abc (will match with abc only if it is at the beginning of a string)
$	Used to match with the position after the last character in a string	abc$ (will match with abc only if it is at the end of a string)
.	Matches with any single character	a.a (will match with any string that has an a followed by any character followed by an a e.g. ‘aca’, ‘aba’)
-	Used to specify a range of values in a character class	[A-Z] (character in the range of A to Z)
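
The less obvious symbols in the table (escape, negated class, anchors, dot and range) can be tried out in the same way. This is a rough sketch assuming Python's re module, whose syntax agrees with the table for these cases; the helper names full and found are invented here. full tests the whole string, while found looks for the pattern anywhere in it, which is what makes ^ and $ meaningful.

```python
import re


def full(pattern: str, text: str) -> bool:
    """True when the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


def found(pattern: str, text: str) -> bool:
    """True when the pattern occurs somewhere in the text."""
    return re.search(pattern, text) is not None


# Raw strings (r"...") stop Python itself from interpreting the backslash.
print(full(r"a\*", "a*"))        # escaped *: a literal asterisk
print(full(r"a\*", "aa"))        # rejected: \* no longer means repetition
print(full(r"a[^bc]", "aa"))     # second character is neither b nor c
print(full(r"a[^bc]", "ab"))     # rejected: b is excluded by [^bc]
print(found(r"^abc", "abcabc"))  # abc at the start of the string
print(found(r"^abc", "cabc"))    # rejected: abc is not at the start
print(found(r"abc$", "cabc"))    # abc at the end of the string
print(found(r"abc$", "abca"))    # rejected: abc is not at the end
print(full(r"a.a", "aca"))       # . stands for any single character
print(full(r"[A-Z]", "Q"))       # one upper-case letter
print(full(r"[A-Z]", "q"))       # rejected: lower case is out of range
```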
