More on regular expressions

Concept

Regular expressions are patterns (rules) used to match combinations of characters in a string. They are composed of ordinary characters (such as the letters a-z and A-Z) and special characters (also called metacharacters).

Regular Expression Usage

1. Match: check that data has the expected format (for example, format validation at login or registration)
2. Replace: replace part of a text (for example, illegal characters)
3. Extract: pull out the specific part of a string that we want (for example, a URL's domain name or a parameter)
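
As a rough illustration of the extraction use case, the sketch below pulls a domain name and a query parameter out of a URL. The helper names getDomain and getParam and the example.com URL are assumptions made for this example, and getParam assumes the parameter name contains no special regex characters.

```javascript
// Hypothetical helper: capture the host name that follows the protocol.
function getDomain(url) {
  const match = /^https?:\/\/([^/?#:]+)/i.exec(url);
  return match ? match[1] : null;
}

// Hypothetical helper: capture the value of one query parameter.
// Assumes `name` contains no characters that are special in a pattern.
function getParam(url, name) {
  const reg = new RegExp('[?&]' + name + '=([^&#]*)');
  const match = reg.exec(url);
  return match ? decodeURIComponent(match[1]) : null;
}

const sampleUrl = 'https://example.com/path?id=42&tag=a%20b'; // assumed sample input
console.log(getDomain(sampleUrl));       // example.com
console.log(getParam(sampleUrl, 'tag')); // a b
```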

RegExp syntax

Create a regular expression object using the constructor: const reg = new RegExp("pattern", "modifiers"). This form is usually used when the pattern to match is not known in advance.
Create a regular expression object using a literal: const reg = /pattern/modifiers. This is the more common way to write it.
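
A minimal sketch of when the constructor form is useful: the pattern is built from a value that is only known at runtime. The helpers escapeRegExp and containsWord are hypothetical names introduced for this example.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build the pattern at runtime with the constructor.
// Backslashes must be doubled inside the string passed to RegExp.
function containsWord(sentence, word) {
  const reg = new RegExp('\\b' + escapeRegExp(word) + '\\b', 'i');
  return reg.test(sentence);
}

console.log(containsWord('I am a lucky boy', 'Lucky')); // true
console.log(containsWord('I am unlucky', 'lucky'));     // false
```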

RegExp object methods

compile: compiles a regular expression (deprecated as of version 1.5)
exec: searches the string for the specified value, returning the value found and its position
test: checks whether the specified value is present in the string, returning true or false
toString: returns the regular expression as a string

Using the RegExp object methods

test is a commonly used method for checking whether a pattern matches, for example in form format validation or when checking whether a string contains illegal characters.

For example:

var reg = /lucky/g
reg.test('I am lucky boy') // returns true
// With a global match, lastIndex advances after every successful call
var a = /ab/g
var str = 'kkabkkabkk'
a.test(str) // returns true
a.test(str) // returns true
a.lastIndex // 8
a.test(str) // returns false
a.lastIndex // 0, reset to the initial value
a.test(str) // the match starts from the beginning again, returning true
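
The lastIndex behaviour above can cause surprises when one global expression is reused on different strings. The following is a small sketch of the pitfall and one way around it; hasLucky is a hypothetical helper name.

```javascript
// A global regex remembers lastIndex between calls, even across different strings.
const globalReg = /lucky/g;
const first = globalReg.test('lucky boy');   // true, lastIndex is now 5
const second = globalReg.test('lucky girl'); // false, the search started at index 5

// Hypothetical helper: a non-global regex always starts searching at index 0.
function hasLucky(text) {
  return /lucky/.test(text);
}

console.log(first, second);                                 // true false
console.log(hasLucky('lucky boy'), hasLucky('lucky girl')); // true true
```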

exec searches the string for the specified value and returns the values found (as an array), or null if no match is found.

const reg = /chen/
reg.exec('My name is chenjiaobin')
// returns ["chen", index: 11, input: "My name is chenjiaobin", groups: undefined]

Element 0 is the text that matches the whole regular expression.
Element 1 is the text that matches the first subexpression, as in reg = /(pattern)/; if there are more subexpressions, they follow as elements 2, 3, and so on.

// A regular expression with two subexpressions
const reg = /(chen)(jiao)/
reg.exec('My name is chenjiaobin')
// returns ["chenjiao", "chen", "jiao", index: 11, input: "My name is chenjiaobin", groups: undefined]

The index property is the position of the first character of the matched text.
The input property is the string that was searched.
The groups property stores information about named capture groups. It only has a value when a capture group is named, for example:

// ?<first> gives the capture group a name
const reg = /(?<first>chen)/
reg.exec('My name is chenjiaobin')
// returns ["chen", "chen", index: 11, input: "My name is chenjiaobin", groups: { first: 'chen' }]
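
Named groups are convenient when a match has several parts. The sketch below reads a date through the groups property; the parseDate helper and the date format are assumptions chosen for illustration.

```javascript
// Each part of the date gets its own name.
const dateReg = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Hypothetical helper: return the named parts as a plain object, or null.
function parseDate(text) {
  const match = dateReg.exec(text);
  return match ? { ...match.groups } : null;
}

const parsed = parseDate('2017-06-12');
console.log(parsed.year, parsed.month, parsed.day); // 2017 06 12
console.log(parseDate('no date here'));             // null
```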

When the regular expression is global, the search starts at the position given by the RegExp's lastIndex property.

After a match is found, lastIndex is set to the index just past the last character of the matched text. Once there is nothing left to match, exec returns null and lastIndex is reset to 0. So when the same regular expression is reused on a new string, reset lastIndex to 0 before matching; or, instead of assigning the regular expression instance to a variable, write /pattern/.exec('abcd') directly.

// Global
var reg = /a/g
reg.exec('abcdabcdabcd') // returns ["a", index: 0, input: "abcdabcdabcd", groups: undefined]
reg.exec('abcdabcdabcd') // returns ["a", index: 4, input: "abcdabcdabcd", groups: undefined]
reg.exec('abcdabcdabcd') // returns ["a", index: 8, input: "abcdabcdabcd", groups: undefined]
// Not global
var reg = /a/
reg.exec('abcdabcdabcd') // returns ["a", index: 0, input: "abcdabcdabcd", groups: undefined]
// Without the g flag, exec returns the same value as match. With the g flag, match returns every match:
'abcdabcdabcd'.match(/a/g) // returns ['a', 'a', 'a']
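
Because a global exec resumes from lastIndex, it can be called in a loop to visit every match. This is a sketch under that assumption; findAllIndexes is a hypothetical helper name.

```javascript
// Hypothetical helper: collect the index of every match by calling exec in a loop.
function findAllIndexes(pattern, text) {
  const reg = new RegExp(pattern, 'g'); // fresh instance, so lastIndex starts at 0
  const indexes = [];
  let match;
  while ((match = reg.exec(text)) !== null) {
    indexes.push(match.index);
    // Guard against an endless loop when the pattern matches an empty string.
    if (match[0] === '') reg.lastIndex++;
  }
  return indexes;
}

console.log(findAllIndexes('a', 'abcdabcdabcd')); // [ 0, 4, 8 ]
```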

String methods that support regular expressions

search: finds the position of the value that matches the regular expression
match: finds one or more matches of the regular expression
replace: replaces the substring that matches the regular expression (without changing the original string)
split: splits a string into an array of strings (without changing the original string)

Using the String methods

search

string.search(searchValue)
// searchValue can be either a string or a regular expression; returns the index of the first match, or -1 if there is none
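
A short sketch of search in use; the sample sentence is reused from the exec examples and the variable names are illustrative only.

```javascript
const sentence = 'My name is chenjiaobin';

const found = sentence.search(/chen/);       // index of the first match
const missing = sentence.search(/kevin/);    // -1 when nothing matches
const ignoreCase = sentence.search(/NAME/i); // flags on the pattern still apply

console.log(found, missing, ignoreCase); // 11 -1 3
```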

match

The behavior of this method depends heavily on whether the pattern is global. It returns null if nothing matches.

var reg = /chen/
'chenjiaochen'.match(reg) // returns ["chen", index: 0, input: "chenjiaochen", groups: undefined]
var reg = /chen/g
'chenjiaochen'.match(reg) // returns ["chen", "chen"]

replace

string.replace(pattern, str | fn)
// The second parameter can be a string or an anonymous function; the function lets us transform each match however we like, for example:
const str = 'He is 22 years old, she is 20 years old, his father is 40 years old, her father is 45 years old'
const reg = /(\d+) years old/g
// The callback receives the matched text first, then the captured groups; the number of parameters is not fixed
function formatAge(a, b, c, d) {
  const year = (new Date()).getFullYear() - parseInt(a) - 1
  return a + ' (born in ' + year + ')'
}
str.replace(reg, formatAge)
// returns (when run in 2021) "He is 22 years old (born in 1998), she is 20 years old (born in 2000), his father is 40 years old (born in 1980), her father is 45 years old (born in 1975)"

var regex = /(\d{4})-(\d{2})-(\d{2})/;
var string = "2017-06-12";
var result = string.replace(regex, function(match, year, month, day) {
  return month + "/" + day + "/" + year;
});
console.log(result); // => "06/12/2017"

split

string.split(separator, limit)
// The first argument is the separator, which can be a string or a regular expression; the second argument limits the maximum length of the returned array
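
A brief sketch of split with a regular expression as the separator; the sample string is an assumption for illustration.

```javascript
const raw = 'a, b;c  d'; // sample input with mixed separators

// Split on any run of commas, semicolons, or whitespace.
const parts = raw.split(/[,;\s]+/);
// The second argument caps the length of the returned array.
const firstTwo = raw.split(/[,;\s]+/, 2);

console.log(parts);    // [ 'a', 'b', 'c', 'd' ]
console.log(firstTwo); // [ 'a', 'b' ]
console.log(raw);      // the original string is unchanged
```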

Modifiers

const reg = /^cheng?$/img

Modifier	Meaning	Description
i	ignoreCase: case-insensitive	Matching ignores case, so a is treated the same as A
g	global: global match	Finds all matches (note: check whether lastIndex has been reset before reusing the expression)
m	multiline: multi-line match	Makes the anchors ^ and $ match the beginning and end of each line, not just the beginning and end of the entire string
s	dotAll: the dot (.) also matches the newline character \n	By default, the dot matches any character except the newline character \n. With the s modifier, the dot also matches \n

Example of the m modifier (multi-line matching):

const str = 'abc\nabc\nabc'
str.match(/^abc/g) // returns ['abc']
str.match(/^abc/gm) // returns ['abc', 'abc', 'abc']; the m modifier adds multi-line matching

An example of the s modifier:

const str = 'bei\nzuo\nsi'
str.match(/bei./) // returns null
str.match(/bei./s) // returns ['bei\n', ...]; the string matches and an array is returned

Metacharacters (common)

Character	Description
\	Marks the next character as special or as a literal; for example, \n is a newline character
^	Matches the start of the input string. If the multiline flag is set, ^ also matches the position after \n or \r; for example, 'abc\nab'.match(/^ab/gm) returns ['ab', 'ab'], or ['ab'] without the m flag
$	Matches the end of the input string. If the multiline flag is set, $ also matches the position before \n or \r
*	Matches the previous subexpression 0 or more times. For example, zo* matches "z" and "zoo". * is equivalent to {0,}
+	Matches the previous subexpression 1 or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}
?	Matches the previous subexpression 0 or 1 times. For example, "do(es)?" matches "do" or "does". ? is equivalent to {0,1}
{n}	n is a non-negative integer; matches exactly n times. For example, 'o{2}' does not match the 'o' in 'Bob', but matches the two o's in 'food'
{n,}	Matches at least n times. For example, 'o{2,}' does not match the 'o' in 'Bob', but matches all the o's in 'foooood'. 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'
[xyz]	A character set: matches any one of the characters it contains. For example, [up] matches the u in lucky
[^xyz]	The opposite of [xyz]: matches any character not in the set. For example, [^up] matches l, c, k, and y in lucky
[a-z]	A character range: matches any character in the specified range. For example, [a-c] matches a, b, or c
[^a-z]	The opposite of [a-z]: matches any character not in the specified range
\b	Matches a word boundary, that is, the position between a word and a space. For example, er\b matches the er in never, or the er in 'aer b', but not the er in verb
\B	Matches a non-word boundary. 'er\B' matches the 'er' in verb, but not the 'er' in 'never'
\d (\D is the opposite)	Matches a digit character, equivalent to [0-9]
\s (\S is the opposite)	Matches any whitespace character, including spaces, tabs, form feeds, etc., equivalent to [ \f\n\r\t\v]
\w (\W is the opposite)	Matches letters, digits, and underscores, equivalent to [A-Za-z0-9_]
(pattern)	Capturing group: matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0…$9 properties in JScript. To match literal parenthesis characters, use '\(' or '\)'
.	Matches any single character except newline characters (\n, \r). To match any character, including \n, use a pattern such as '(.|\n)' or add the s modifier
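
A few of the table entries can be checked directly. The sketch below reuses the table's own sample words; the variable names are illustrative only.

```javascript
// Word boundaries
const erAtEnd = 'never'.match(/er\b/).index;   // 3: the 'er' at the end of the word
const erInVerb = /er\b/.test('verb');          // false: 'er' is followed by 'b'
const erInside = /er\B/.test('verb');          // true: 'er' is inside the word

// Quantifiers
const twoOrMore = 'foooood'.match(/o{2,}/)[0]; // 'ooooo'
const exactlyTwo = /o{2}/.test('Bob');         // false: only one 'o'

// Character sets
const inSet = 'lucky'.match(/[up]/g);          // [ 'u' ]
const notInSet = 'lucky'.match(/[^up]/g);      // [ 'l', 'c', 'k', 'y' ]

console.log(erAtEnd, erInVerb, erInside, twoOrMore, exactlyTwo, inSet, notInSet);
```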

Extended explanations

?: When this character immediately follows another quantifier (*, +, ?, {n}, {n,}, {n,m}), the match becomes non-greedy. The default greedy mode matches as much as possible; for example, against 'ooo', 'o+?' matches only a single 'o', while 'o+' matches all of the o's.
To match one of the special characters ($, (, ), *, +, ., [, ?, \, ^, {, |) literally, put the escape character \ in front of it.
+ and * are greedy: they match as many characters as possible. Adding a ? after them makes the match non-greedy (minimal). For example, against an HTML string such as <h1>oh my god</h1>, the expression /<.*>/ matches the whole string, while /<.*?>/ matches only the first tag.
Parentheses () denote a capturing group, which stores the matched value. This temporary cache can be referenced with \n, where n is the group number: in /([a-z]+)abc\1/, \1 matches the same text that ([a-z]+) captured. Because caching the match is a side effect of using parentheses, you can put ?: at the start of the group to eliminate it.

var reg = /(chen)jiaobin(kevin)/
'chenjiaobinkevinkk'.replace(reg, '$1 test $2')
// returns "chen test kevinkk": $1 is chen and $2 is kevin
var reg = /(?:chen)jiaobin(kevin)/
'chenjiaobinkevinkk'.replace(reg, '$1 test $2')
// returns "kevin test $2kk": the first pair of parentheses is no longer cached, so $1 is kevin, and $2 has no group to refer to, so it is output as a plain string

// Backreference with \1
'chenjiaobinchenkk'.match(/(chen)jiaobin\1/)
// returns ["chenjiaobinchen", "chen", index: 0, input: "chenjiaobinchenkk", groups: undefined]
// With ?: added
'chenjiaobinchenkk'.match(/(?:chen)jiaobin\1/)
// returns null: the cache is removed, so \1 no longer refers to a captured group
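
The difference between greedy and non-greedy matching described above can be seen in a short sketch. The HTML string is a sample input assumed for illustration.

```javascript
const html = '<h1>oh my god</h1>'; // assumed sample input

const greedy = html.match(/<.*>/)[0]; // takes as much as possible
const lazy = html.match(/<.*?>/)[0];  // stops at the first '>'

const greedyO = 'ooo'.match(/o+/)[0];
const lazyO = 'ooo'.match(/o+?/)[0];

console.log(greedy);         // <h1>oh my god</h1>
console.log(lazy);           // <h1>
console.log(greedyO, lazyO); // ooo o
```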

- exp1(?=exp2): finds exp1 that is followed by exp2, e.g. 'ageoldageyear'.match(/age(?=year)/)
- (?<=exp2)exp1: finds exp1 that is preceded by exp2
- exp1(?!exp2): finds exp1 that is not followed by exp2
- (?<!exp2)exp1: finds exp1 that is not preceded by exp2
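
The lookahead and lookbehind forms can be tried out as follows. This is a sketch assuming a JavaScript engine that supports lookbehind; the prices string is a sample input made up for the example.

```javascript
const text = 'ageoldageyear';
const followedByYear = text.match(/age(?=year)/).index;    // the second 'age'
const notFollowedByYear = text.match(/age(?!year)/).index; // the first 'age'

const prices = 'cost $42 or 15'; // assumed sample input
const afterDollar = prices.match(/(?<=\$)\d+/)[0];      // digits preceded by '$'
const notAfterDollar = prices.match(/(?<!\$)\b\d+/)[0]; // a whole number not preceded by '$'

console.log(followedByYear, notFollowedByYear); // 6 0
console.log(afterDollar, notAfterDollar);       // 42 15
```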

Operator precedence

The following are ranked from highest to lowest priority:

\ : escape character
(), (?:), (?=), [] : parentheses and square brackets
*, +, ?, {n}, {n,}, {n,m} : quantifiers
^, $, \any metacharacter, any character : anchors and sequences
| : alternation (the "or" operation)

Priority problems are rare, but worth knowing about. For example, m|food matches either m or food, not mood or food, because a sequence of characters has higher priority than the "or" operation. Rewriting it as (m|f)ood fixes the problem.
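
The precedence example can be verified with a small sketch; the variable names are illustrative only.

```javascript
// Without grouping, the alternatives are 'm' and 'food'.
const loose = 'mood'.match(/m|food/)[0];

// With grouping, the alternatives are 'm' and 'f', each followed by 'ood'.
const grouped = 'mood'.match(/(m|f)ood/)[0];
const food = 'food'.match(/(m|f)ood/)[0];

console.log(loose);   // m
console.log(grouped); // mood
console.log(food);    // food
```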

Matching rules for common scenarios

Chinese characters: ^[\u4e00-\u9fa5]{0,}$
ID card number (15 digits, 18 digits, or 17 digits plus a check digit X or x): (^\d{15}$)|(^\d{18}$)|(^\d{17}(\d|X|x)$)
Email validation: ^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$
Mobile number: ^(13[0-9]|14[579]|15[0-3,5-9]|16[6]|17[0135678]|18[0-9]|19[89])\d{8}$
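
As a sketch of how such rules might be used for validation, the snippet below wraps three of them in a lookup table. The patterns are reconstructed from the list above, and the names patterns and validate as well as the sample values are assumptions made for this example.

```javascript
// Patterns reconstructed from the list above (assumed forms).
const patterns = {
  chinese: /^[\u4e00-\u9fa5]{0,}$/,
  email: /^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$/,
  mobile: /^(13[0-9]|14[579]|15[0-3,5-9]|16[6]|17[0135678]|18[0-9]|19[89])\d{8}$/,
};

// Hypothetical helper: check a value against one of the named patterns.
function validate(kind, value) {
  return patterns[kind].test(value);
}

console.log(validate('mobile', '13812345678'));     // true
console.log(validate('mobile', '12345678901'));     // false
console.log(validate('email', 'user@example.com')); // true
console.log(validate('chinese', '你好'));            // true
```

Note that {0,} in the Chinese-character rule also accepts an empty string; {1,} would be needed to require at least one character.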

Regex validation and visualization tools

Regulex: jex.im/regulex/#!f… (recommended)
Runoob tools: c.runoob.com/front-end/8…
