syntax.txt revision 2a99a7e74a7f215066514fe81d2bfa6639d9eddd
RE2 regular expression syntax reference
---------------------------------------

Single characters:
.	any character, possibly including newline (s=true)
[xyz]	character class
[^xyz]	negated character class
\d	Perl character class
\D	negated Perl character class
[[:alpha:]]	ASCII character class
[[:^alpha:]]	negated ASCII character class
\pN	Unicode character class (one-letter name)
\p{Greek}	Unicode character class
\PN	negated Unicode character class (one-letter name)
\P{Greek}	negated Unicode character class

Composites:
xy	«x» followed by «y»
x|y	«x» or «y» (prefer «x»)
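
A minimal sketch of the «x|y» preference rule, assuming Go's standard regexp package (which documents itself as accepting RE2 syntax, apart from «\C») is a fair stand-in for RE2. The helper name firstMatch is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: Go's regexp package stands in for RE2 here.
// firstMatch (hypothetical helper) returns the leftmost match of
// pattern in s, or "" if there is none.
func firstMatch(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	// x|y prefers x: "a" wins even though "ab" would match more text.
	fmt.Println(firstMatch(`a|ab`, "ab")) // a
	// Reordering the alternatives changes the result.
	fmt.Println(firstMatch(`ab|a`, "ab")) // ab
}
```

Because the first alternative is preferred, a longer alternative that shares a prefix with a shorter one usually belongs first.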

Repetitions:
x*	zero or more «x», prefer more
x+	one or more «x», prefer more
x?	zero or one «x», prefer one
x{n,m}	«n» or «n»+1 or ... or «m» «x», prefer more
x{n,}	«n» or more «x», prefer more
x{n}	exactly «n» «x»
x*?	zero or more «x», prefer fewer
x+?	one or more «x», prefer fewer
x??	zero or one «x», prefer zero
x{n,m}?	«n» or «n»+1 or ... or «m» «x», prefer fewer
x{n,}?	«n» or more «x», prefer fewer
x{n}?	exactly «n» «x»
x{}	(== x*) NOT SUPPORTED vim
x{-}	(== x*?) NOT SUPPORTED vim
x{-n}	(== x{n}?) NOT SUPPORTED vim
x=	(== x?) NOT SUPPORTED vim
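
The "prefer more" and "prefer fewer" entries can be sketched as follows. This is an illustration only, assuming Go's standard regexp package as a stand-in for RE2; firstMatch is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: Go's regexp package stands in for RE2 here.
// firstMatch (hypothetical helper) returns the leftmost match of
// pattern in s, or "" if there is none.
func firstMatch(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	s := "<a><b>"
	fmt.Println(firstMatch(`<.*>`, s))  // <a><b>  (x* prefers more)
	fmt.Println(firstMatch(`<.*?>`, s)) // <a>     (x*? prefers fewer)

	fmt.Println(firstMatch(`a{2,4}`, "aaaaa"))  // aaaa
	fmt.Println(firstMatch(`a{2,4}?`, "aaaaa")) // aa
	fmt.Println(firstMatch(`a{3}`, "aaaaa"))    // aaa
}
```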

Possessive repetitions:
x*+	zero or more «x», possessive NOT SUPPORTED
x++	one or more «x», possessive NOT SUPPORTED
x?+	zero or one «x», possessive NOT SUPPORTED
x{n,m}+	«n» or ... or «m» «x», possessive NOT SUPPORTED
x{n,}+	«n» or more «x», possessive NOT SUPPORTED
x{n}+	exactly «n» «x», possessive NOT SUPPORTED

Grouping:
(re)	numbered capturing group
(?P<name>re)	named & numbered capturing group
(?<name>re)	named & numbered capturing group NOT SUPPORTED
(?'name're)	named & numbered capturing group NOT SUPPORTED
(?:re)	non-capturing group
(?flags)	set flags within current group; non-capturing
(?flags:re)	set flags during re; non-capturing
(?#text)	comment NOT SUPPORTED
(?|x|y|z)	branch numbering reset NOT SUPPORTED
(?>re)	possessive match of «re» NOT SUPPORTED
re@>	possessive match of «re» NOT SUPPORTED vim
%(re)	non-capturing group NOT SUPPORTED vim
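
A small sketch of how «(re)», «(?P<name>re)» and «(?:re)» interact, assuming Go's standard regexp package as a stand-in for RE2. The helper namedGroups and the date pattern are hypothetical, chosen only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: Go's regexp package stands in for RE2 here.
// namedGroups (hypothetical helper) returns the text captured by each
// named group in the leftmost match of pattern in s, or nil if there
// is no match.
func namedGroups(pattern, s string) map[string]string {
	re := regexp.MustCompile(pattern)
	m := re.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" { // index 0 and unnamed groups have an empty name
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	// Two named groups, one plain numbered group, one non-capturing group.
	pattern := `(?P<year>\d{4})-(?P<month>\d{2})-(\d{2})(?:T\d{2})?`
	fmt.Println(regexp.MustCompile(pattern).NumSubexp()) // 3: (?:re) is not counted

	g := namedGroups(pattern, "2024-05-17T09")
	fmt.Println(g["year"], g["month"]) // 2024 05
}
```

Named groups are numbered as well, so «year» is also group 1 and «month» is group 2 in this pattern.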

Flags:
i	case-insensitive (default false)
m	multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
s	let «.» match «\n» (default false)
U	ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).

Empty strings:
^	at beginning of text or line («m»=true)
$	at end of text (like «\z» not «\Z») or line («m»=true)
\A	at beginning of text
\b	at word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
\B	not a word boundary
\G	at beginning of subtext being searched NOT SUPPORTED pcre
\G	at end of last match NOT SUPPORTED perl
\Z	at end of text, or before newline at end of text NOT SUPPORTED
\z	at end of text
(?=re)	before text matching «re» NOT SUPPORTED
(?!re)	before text not matching «re» NOT SUPPORTED
(?<=re)	after text matching «re» NOT SUPPORTED
(?<!re)	after text not matching «re» NOT SUPPORTED
re&	before text matching «re» NOT SUPPORTED vim
re@=	before text matching «re» NOT SUPPORTED vim
re@!	before text not matching «re» NOT SUPPORTED vim
re@<=	after text matching «re» NOT SUPPORTED vim
re@<!	after text not matching «re» NOT SUPPORTED vim
\zs	sets start of match (= \K) NOT SUPPORTED vim
\ze	sets end of match NOT SUPPORTED vim
\%^	beginning of file NOT SUPPORTED vim
\%$	end of file NOT SUPPORTED vim
\%V	on screen NOT SUPPORTED vim
\%#	cursor position NOT SUPPORTED vim
\%'m	mark «m» position NOT SUPPORTED vim
\%23l	in line 23 NOT SUPPORTED vim
\%23c	in column 23 NOT SUPPORTED vim
\%23v	in virtual column 23 NOT SUPPORTED vim

Escape sequences:
\a	bell (== \007)
\f	form feed (== \014)
\t	horizontal tab (== \011)
\n	newline (== \012)
\r	carriage return (== \015)
\v	vertical tab character (== \013)
\*	literal «*», for any punctuation character «*»
\123	octal character code (up to three digits)
\x7F	hex character code (exactly two digits)
\x{10FFFF}	hex character code
\C	match a single byte even in UTF-8 mode
\Q...\E	literal text «...» even if «...» has punctuation

\1	backreference NOT SUPPORTED
\b	backspace NOT SUPPORTED (use «\010»)
\cK	control char ^K NOT SUPPORTED (use an octal escape such as «\013»)
\e	escape NOT SUPPORTED (use «\033»)
\g1	backreference NOT SUPPORTED
\g{1}	backreference NOT SUPPORTED
\g{+1}	backreference NOT SUPPORTED
\g{-1}	backreference NOT SUPPORTED
\g{name}	named backreference NOT SUPPORTED
\g<name>	subroutine call NOT SUPPORTED
\g'name'	subroutine call NOT SUPPORTED
\k<name>	named backreference NOT SUPPORTED
\k'name'	named backreference NOT SUPPORTED
\lX	lowercase «X» NOT SUPPORTED
\ux	uppercase «x» NOT SUPPORTED
\L...\E	lowercase text «...» NOT SUPPORTED
\K	reset beginning of «$0» NOT SUPPORTED
\N{name}	named Unicode character NOT SUPPORTED
\R	line break NOT SUPPORTED
\U...\E	upper case text «...» NOT SUPPORTED
\X	extended Unicode sequence NOT SUPPORTED

\%d123	decimal character 123 NOT SUPPORTED vim
\%xFF	hex character FF NOT SUPPORTED vim
\%o123	octal character 123 NOT SUPPORTED vim
\%u1234	Unicode character 0x1234 NOT SUPPORTED vim
\%U12345678	Unicode character 0x12345678 NOT SUPPORTED vim
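
A few of the escape sequences above, exercised in a sketch that assumes Go's standard regexp package as a stand-in for RE2 (Go documents one difference: it does not accept «\C», so that escape is left out). The helper names matches and compiles are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: Go's regexp package stands in for RE2 here.
// matches (hypothetical helper) reports whether pattern matches
// anywhere in s.
func matches(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

// compiles (hypothetical helper) reports whether pattern is accepted.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// \Q...\E turns punctuation into literal text.
	fmt.Println(matches(`^\Qa.b*c\E$`, "a.b*c")) // true
	fmt.Println(matches(`^\Qa.b*c\E$`, "axbbc")) // false

	// Octal, two-digit hex and braced hex all name the letter A here.
	fmt.Println(matches(`^\101\x41\x{41}$`, "AAA")) // true
	// Braced hex reaches beyond two digits.
	fmt.Println(matches(`^\x{1F600}$`, "\U0001F600")) // true

	// Backreferences and \e are listed as NOT SUPPORTED.
	fmt.Println(compiles(`(a)\1`)) // false
	fmt.Println(compiles(`\e`))    // false
	fmt.Println(compiles(`\033`))  // true: the suggested octal replacement
}
```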

Character class elements:
x	single character
A-Z	character range (inclusive)
\d	Perl character class
[:foo:]	ASCII character class «foo»
\p{Foo}	Unicode character class «Foo»
\pF	Unicode character class «F» (one-letter name)

Named character classes as character class elements:
[\d]	digits (== \d)
[^\d]	not digits (== \D)
[\D]	not digits (== \D)
[^\D]	not not digits (== \d)
[[:name:]]	named ASCII class inside character class (== [:name:])
[^[:name:]]	named ASCII class inside negated character class (== [:^name:])
[\p{Name}]	named Unicode property inside character class (== \p{Name})
[^\p{Name}]	named Unicode property inside negated character class (== \P{Name})
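
Named classes can be mixed with single characters and ranges inside one bracket expression. The sketch below assumes Go's standard regexp package as a stand-in for RE2; firstMatch is a hypothetical helper name and the sample strings are invented.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: Go's regexp package stands in for RE2 here.
// firstMatch (hypothetical helper) returns the leftmost match of
// pattern in s, or "" if there is none.
func firstMatch(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	fmt.Println(firstMatch(`[\d.]+`, "v3.14!"))            // 3.14
	fmt.Println(firstMatch(`[^\D]+`, "abc123"))            // 123 (same as \d+)
	fmt.Println(firstMatch(`[[:alpha:]_]+`, "42 foo_bar")) // foo_bar
	fmt.Println(firstMatch(`[^[:alpha:]]+`, "abc123def"))  // 123
	fmt.Println(firstMatch(`[\p{Greek}A-C]+`, "xyzαβγABC")) // αβγABC
}
```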

Perl character classes:
\d	digits (== [0-9])
\D	not digits (== [^0-9])
\s	whitespace (== [\t\n\f\r ])
\S	not whitespace (== [^\t\n\f\r ])
\w	word characters (== [0-9A-Za-z_])
\W	not word characters (== [^0-9A-Za-z_])

\h	horizontal space NOT SUPPORTED
\H	not horizontal space NOT SUPPORTED
\v	vertical space NOT SUPPORTED
\V	not vertical space NOT SUPPORTED

ASCII character classes:
[:alnum:]	alphanumeric (== [0-9A-Za-z])
[:alpha:]	alphabetic (== [A-Za-z])
[:ascii:]	ASCII (== [\x00-\x7F])
[:blank:]	blank (== [\t ])
[:cntrl:]	control (== [\x00-\x1F\x7F])
[:digit:]	digits (== [0-9])
[:graph:]	graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[:lower:]	lower case (== [a-z])
[:print:]	printable (== [ -~] == [ [:graph:]])
[:punct:]	punctuation (== [!-/:-@[-`{-~])
[:space:]	whitespace (== [\t\n\v\f\r ])
[:upper:]	upper case (== [A-Z])
[:word:]	word characters (== [0-9A-Za-z_])
[:xdigit:]	hex digit (== [0-9A-Fa-f])
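
The equivalences above mean that the Perl and ASCII classes cover ASCII only, and that «\s» and «[:space:]» differ on vertical tab. A minimal sketch, assuming Go's standard regexp package as a stand-in for RE2; matches is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: Go's regexp package stands in for RE2 here.
// matches (hypothetical helper) reports whether pattern matches
// anywhere in s.
func matches(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

func main() {
	arabicThree := "\u0663" // ARABIC-INDIC DIGIT THREE
	eAcute := "\u00e9"      // LATIN SMALL LETTER E WITH ACUTE

	fmt.Println(matches(`\d`, arabicThree))     // false: \d is [0-9]
	fmt.Println(matches(`\p{Nd}`, arabicThree)) // true: Unicode decimal number

	fmt.Println(matches(`\w`, eAcute))          // false: \w is [0-9A-Za-z_]
	fmt.Println(matches(`[[:alpha:]]`, eAcute)) // false: [:alpha:] is [A-Za-z]
	fmt.Println(matches(`\pL`, eAcute))         // true: Unicode letter

	fmt.Println(matches(`\s`, "\v"))          // false: \s omits vertical tab
	fmt.Println(matches(`[[:space:]]`, "\v")) // true: [:space:] includes it
}
```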

Unicode character class names--general category:
C	other
Cc	control
Cf	format
Cn	unassigned code points NOT SUPPORTED
Co	private use
Cs	surrogate
L	letter
LC	cased letter NOT SUPPORTED
L&	cased letter NOT SUPPORTED
Ll	lowercase letter
Lm	modifier letter
Lo	other letter
Lt	titlecase letter
Lu	uppercase letter
M	mark
Mc	spacing mark
Me	enclosing mark
Mn	non-spacing mark
N	number
Nd	decimal number
Nl	letter number
No	other number
P	punctuation
Pc	connector punctuation
Pd	dash punctuation
Pe	close punctuation
Pf	final punctuation
Pi	initial punctuation
Po	other punctuation
Ps	open punctuation
S	symbol
Sc	currency symbol
Sk	modifier symbol
Sm	math symbol
So	other symbol
Z	separator
Zl	line separator
Zp	paragraph separator
Zs	space separator

Unicode character class names--scripts:
Arabic	Arabic
Armenian	Armenian
Balinese	Balinese
Bengali	Bengali
Bopomofo	Bopomofo
Braille	Braille
Buginese	Buginese
Buhid	Buhid
Canadian_Aboriginal	Canadian Aboriginal
Carian	Carian
Cham	Cham
Cherokee	Cherokee
Common	characters not specific to one script
Coptic	Coptic
Cuneiform	Cuneiform
Cypriot	Cypriot
Cyrillic	Cyrillic
Deseret	Deseret
Devanagari	Devanagari
Ethiopic	Ethiopic
Georgian	Georgian
Glagolitic	Glagolitic
Gothic	Gothic
Greek	Greek
Gujarati	Gujarati
Gurmukhi	Gurmukhi
Han	Han
Hangul	Hangul
Hanunoo	Hanunoo
Hebrew	Hebrew
Hiragana	Hiragana
Inherited	inherit script from previous character
Kannada	Kannada
Katakana	Katakana
Kayah_Li	Kayah Li
Kharoshthi	Kharoshthi
Khmer	Khmer
Lao	Lao
Latin	Latin
Lepcha	Lepcha
Limbu	Limbu
Linear_B	Linear B
Lycian	Lycian
Lydian	Lydian
Malayalam	Malayalam
Mongolian	Mongolian
Myanmar	Myanmar
New_Tai_Lue	New Tai Lue (aka Simplified Tai Lue)
Nko	Nko
Ogham	Ogham
Ol_Chiki	Ol Chiki
Old_Italic	Old Italic
Old_Persian	Old Persian
Oriya	Oriya
Osmanya	Osmanya
Phags_Pa	'Phags Pa
Phoenician	Phoenician
Rejang	Rejang
Runic	Runic
Saurashtra	Saurashtra
Shavian	Shavian
Sinhala	Sinhala
Sundanese	Sundanese
Syloti_Nagri	Syloti Nagri
Syriac	Syriac
Tagalog	Tagalog
Tagbanwa	Tagbanwa
Tai_Le	Tai Le
Tamil	Tamil
Telugu	Telugu
Thaana	Thaana
Thai	Thai
Tibetan	Tibetan
Tifinagh	Tifinagh
Ugaritic	Ugaritic
Vai	Vai
Yi	Yi
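
General category and script names are used with «\p», «\pX» (one-letter name) and the negated «\P» forms. The following sketch assumes Go's standard regexp package as a stand-in for RE2; firstMatch is a hypothetical helper name and the sample strings are invented.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: Go's regexp package stands in for RE2 here.
// firstMatch (hypothetical helper) returns the leftmost match of
// pattern in s, or "" if there is none.
func firstMatch(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	// Script names.
	fmt.Println(firstMatch(`\p{Greek}+`, "abc αβγ def")) // αβγ
	fmt.Println(firstMatch(`\p{Han}+`, "kanji: 漢字"))     // 漢字

	// General categories: uppercase letter followed by lowercase letters.
	fmt.Println(firstMatch(`\p{Lu}\p{Ll}+`, "hello World")) // World

	// One-letter name without braces.
	fmt.Println(firstMatch(`\pN+`, "room 42b")) // 42

	// Negated class: digits belong to the Common script, not Latin.
	fmt.Println(firstMatch(`\P{Latin}+`, "abc123def")) // 123
}
```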

Vim character classes:
\i	identifier character NOT SUPPORTED vim
\I	«\i» except digits NOT SUPPORTED vim
\k	keyword character NOT SUPPORTED vim
\K	«\k» except digits NOT SUPPORTED vim
\f	file name character NOT SUPPORTED vim
\F	«\f» except digits NOT SUPPORTED vim
\p	printable character NOT SUPPORTED vim
\P	«\p» except digits NOT SUPPORTED vim
\s	whitespace character (== [ \t]) NOT SUPPORTED vim
\S	non-white space character (== [^ \t]) NOT SUPPORTED vim
\d	digits (== [0-9]) vim
\D	not «\d» vim
\x	hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
\X	not «\x» NOT SUPPORTED vim
\o	octal digits (== [0-7]) NOT SUPPORTED vim
\O	not «\o» NOT SUPPORTED vim
\w	word character vim
\W	not «\w» vim
\h	head of word character NOT SUPPORTED vim
\H	not «\h» NOT SUPPORTED vim
\a	alphabetic NOT SUPPORTED vim
\A	not «\a» NOT SUPPORTED vim
\l	lowercase NOT SUPPORTED vim
\L	not lowercase NOT SUPPORTED vim
\u	uppercase NOT SUPPORTED vim
\U	not uppercase NOT SUPPORTED vim
\_x	«\x» plus newline, for any «x» NOT SUPPORTED vim

Vim flags:
\c	ignore case NOT SUPPORTED vim
\C	match case NOT SUPPORTED vim
\m	magic NOT SUPPORTED vim
\M	nomagic NOT SUPPORTED vim
\v	verymagic NOT SUPPORTED vim
\V	verynomagic NOT SUPPORTED vim
\Z	ignore differences in Unicode combining characters NOT SUPPORTED vim

Magic:
(?{code})	arbitrary Perl code NOT SUPPORTED perl
(??{code})	postponed arbitrary Perl code NOT SUPPORTED perl
(?n)	recursive call to regexp capturing group «n» NOT SUPPORTED
(?+n)	recursive call to relative group «+n» NOT SUPPORTED
(?-n)	recursive call to relative group «-n» NOT SUPPORTED
(?C)	PCRE callout NOT SUPPORTED pcre
(?R)	recursive call to entire regexp (== (?0)) NOT SUPPORTED
(?&name)	recursive call to named group NOT SUPPORTED
(?P=name)	named backreference NOT SUPPORTED
(?P>name)	recursive call to named group NOT SUPPORTED
(?(cond)true|false)	conditional branch NOT SUPPORTED
(?(cond)true)	conditional branch NOT SUPPORTED
(*ACCEPT)	make regexps more like Prolog NOT SUPPORTED
(*COMMIT)	NOT SUPPORTED
(*F)	NOT SUPPORTED
(*FAIL)	NOT SUPPORTED
(*MARK)	NOT SUPPORTED
(*PRUNE)	NOT SUPPORTED
(*SKIP)	NOT SUPPORTED
(*THEN)	NOT SUPPORTED
(*ANY)	set newline convention NOT SUPPORTED
(*ANYCRLF)	NOT SUPPORTED
(*CR)	NOT SUPPORTED
(*CRLF)	NOT SUPPORTED
(*LF)	NOT SUPPORTED
(*BSR_ANYCRLF)	set \R convention NOT SUPPORTED pcre
(*BSR_UNICODE)	NOT SUPPORTED pcre
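
In practice, NOT SUPPORTED constructs tend to surface as a compile-time error rather than as a silent mismatch. The sketch below checks a handful of the entries above, assuming Go's standard regexp package as a stand-in for RE2 (the exact error text is implementation-specific and is not shown). The helper name compiles is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: Go's regexp package stands in for RE2 here.
// compiles (hypothetical helper) reports whether pattern is accepted.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	unsupported := []string{
		`(?R)`,       // recursive call to entire regexp
		`(?P=name)`,  // named backreference
		`(?(1)a|b)`,  // conditional branch
		`a(*FAIL)`,   // backtracking control verb
		`(?{code})`,  // arbitrary Perl code
	}
	for _, p := range unsupported {
		fmt.Printf("%-12s compiles: %v\n", p, compiles(p)) // false for each
	}
}
```

Checking the error from the compile step up front is a reasonable way to validate user-supplied patterns before they are used.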