1. Create regular expressions

There are two ways to create a regular expression:

  1. With a literal such as /ABC/, which supports only static patterns
  2. With the constructor, new RegExp('ABC'), which also supports dynamic patterns

The second way, new RegExp(), is recommended. Its advantage is dynamic patterns: the pattern is passed as an ordinary string, so it can be built by concatenation
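
As a rough sketch of why string-built patterns are useful, the snippet below assembles a whole-word matcher at run time. The helper names escapeRegExp and buildWordMatcher are made up for illustration and are not part of any library. Two assumptions are worth keeping in mind: backslashes have to be doubled inside a string, and user-supplied text should be escaped before it is concatenated into a pattern.

```javascript
// Hypothetical helper: escape regex metacharacters in arbitrary text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a case-insensitive whole-word matcher.
function buildWordMatcher(word) {
  // '\\b' in a string literal becomes \b in the pattern.
  return new RegExp('\\b' + escapeRegExp(word) + '\\b', 'i');
}

const matcher = buildWordMatcher('a.b');
console.log(matcher.test('see A.B here')); // true
console.log(matcher.test('see axb here')); // false, the dot was escaped
```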


2. Methods of regular expressions

Before getting into pattern syntax, it helps to understand what each regex method does. Regex methods live on two data types:

  1. RegExp object method
  2. String method

1. The RegExp-test method

Concept: the test method of a RegExp checks whether the regex matches a string and returns a Boolean value

const reg = /A/;
reg.test('BA'); // true

Note: with the /g flag, repeated test calls on the same string continue from where the previous match ended

const reg = /A/g;
const str = 'A B A C';
reg.test(str); // true
reg.lastIndex; // 1
reg.test(str); // true
reg.lastIndex; // 5
reg.test(str); // false
reg.lastIndex; // 0

  1. Each call to reg.test matches at a different position
  2. lastIndex is the position where the next match attempt starts
  3. The third call finds no match, so it returns false and resets lastIndex to 0
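
One practical consequence of lastIndex deserves a small sketch: a /g regex remembers its position even when the next call is given a different string. The variable names below are illustrative only.

```javascript
const globalA = /A/g;
const onFirst = globalA.test('A');    // true; lastIndex becomes 1
const onSecond = globalA.test('AB');  // false; the search began at index 1 of 'AB'

// Without /g, lastIndex is ignored and every call starts from index 0.
const plainA = /A/;
const plainFirst = plainA.test('A');   // true
const plainSecond = plainA.test('AB'); // true

console.log(onFirst, onSecond, plainFirst, plainSecond); // true false true true
```

For one-off checks, a regex without /g avoids this surprise.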

2. The RegExp-exec method

Concept: each call to exec returns the next match in the string, which makes step-by-step flow control possible

const reg = /AB/g;
const str = 'DCAB ABCD CABD';
reg.exec(str); // ["AB", index: 2, input: "DCAB ABCD CABD", groups: undefined]
reg.lastIndex; // 4
reg.exec(str); // ["AB", index: 5, input: "DCAB ABCD CABD", groups: undefined]
reg.lastIndex; // 7
reg.exec(str); // ["AB", index: 11, input: "DCAB ABCD CABD", groups: undefined]
reg.lastIndex; // 13
reg.exec(str); // null
reg.lastIndex; // 0

  1. reg matches AB, with the global flag g
  2. reg is matched against str = 'DCAB ABCD CABD'
    1. The result is returned, including the matched value, its index, and so on
    2. lastIndex of reg is updated to 4, the start position of the next match attempt
  3. reg is matched against str again
    1. The next result is returned
    2. lastIndex of reg is updated to 7, the start position of the next match attempt
  4. reg is matched against str again
    1. The next result is returned
    2. lastIndex of reg is updated to 13, the start position of the next match attempt
  5. reg is matched against str again
    1. There is no match, so null is returned
    2. lastIndex of reg is reset to 0

RegExp has only these two matching methods, and both can be called repeatedly:

  1. For ordinary use, test is sufficient
  2. For finer-grained control, use exec
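
The repeated exec calls above are usually wrapped in a loop. The following is a minimal sketch of that loop; findAll is a hypothetical helper name, and it assumes the regex passed in has the /g flag (without it, lastIndex never advances and the loop would not terminate).

```javascript
// Hypothetical helper: collect every match of a /g regex with its index.
function findAll(pattern, text) {
  const results = [];
  let match;
  // exec returns null once no further match exists, which ends the loop.
  while ((match = pattern.exec(text)) !== null) {
    results.push({ value: match[0], index: match.index });
  }
  return results;
}

const found = findAll(/AB/g, 'DCAB ABCD CABD');
console.log(found.map((item) => item.index)); // [2, 5, 11]
```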

3. String-search

Concept: search checks whether a string matches a regex and returns the index of the first match, or -1 if there is none

const reg = /A/;
'AB'.search(reg);    // 0
'BA'.search(reg);    // 1
'C'.search(reg);     // -1
---
'AB AB'.search(reg); // 0
'AB AB'.search(reg); // 0

  1. The string is the subject of the call
  2. AB is checked against A and returns index 0
  3. BA is checked against A and returns index 1
  4. C is checked against A; nothing is found, so -1 is returned
  5. AB AB is checked against A and returns index 0
  6. AB AB is checked again and still returns the first index, 0

search cannot iterate over matches and cannot tell how many matches there are, so its use cases are limited:

  1. You only need to know whether there is a match (test does this too)
  2. You also want the index of the first match (this is more convenient than test, although with /g test also exposes the position through lastIndex)

4. String-match method

Concept: the match method of a string returns the parts of the string that match the regex

const reg = /A/;
'AB'.match(reg); // ["A", index: 0, input: "AB", groups: undefined]
'BA'.match(reg); // ["A", index: 1, input: "BA", groups: undefined]
---
const str = 'AB AB';
str.match(reg);  // ["A", index: 0, input: "AB AB", groups: undefined]
str.match(reg);  // ["A", index: 0, input: "AB AB", groups: undefined]

  1. reg is set to match A
  2. AB is matched, A is found, and the result information is returned
  3. BA is matched, A is found, and the result information is returned
  4. str is created with A repeated
  5. str is matched; the first A is found and the result information is returned
  6. str is matched again; it is still the first A that is found and returned

Without /g, match does not iterate: it returns only the first match, in the same format as exec. Note that the behavior changes once the global flag /g is set

const reg = /A/g;
'AB AB'.match(reg); // ["A", "A"]

  1. reg matches A globally
  2. The string is matched and an array of all matching results is returned
  3. Note: this array contains no index information, only the matched values
    1. It is mainly useful for quickly finding out how many matches there are

5. String-matchAll method

As mentioned above, the drawback of match is that it is awkward for retrieving the details of multiple matches. matchAll solves that quickly, because it returns an iterator object (iterators themselves are not covered here)

const reg = /A/g;
[...'AB AB'.matchAll(reg)];
// Output is as follows:
// [
//   ["A", index: 0, input: "AB AB", groups: undefined],
//   ["A", index: 3, input: "AB AB", groups: undefined]
// ]

As you can see, matchAll gives you all the information, and combining it with for...of also removes the need for the while + exec loop

const reg = /A/g;
const matches = 'AB AB'.matchAll(reg);
for (const match of matches) {
    console.log(match);
}
// ["A", index: 0, input: "AB AB", groups: undefined]
// ["A", index: 3, input: "AB AB", groups: undefined]

Note that matchAll requires a regex with the global flag /g; otherwise a TypeError is thrown
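
matchAll becomes more useful once the regex has capture groups, because each result carries its own groups. As an illustrative sketch (parsePairs and the key=value input format are assumptions made up for this example):

```javascript
// Hypothetical helper: turn 'a=1 b=2' into { a: '1', b: '2' }.
function parsePairs(text) {
  const pairs = {};
  // Each match is an array: [whole, key, value].
  for (const match of text.matchAll(/(\w+)=(\w+)/g)) {
    pairs[match[1]] = match[2];
  }
  return pairs;
}

const parsed = parsePairs('a=1 b=2');
console.log(parsed); // { a: '1', b: '2' }
```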


6. String-replace method

The replace method is very common, but it is still worth reviewing

str.replace(regexp|substr, newSubStr|function)

The replace method takes two arguments

  1. Argument 1: a regex or a fixed string
  2. Argument 2: a replacement string or a callback function

'AB'.replace('A', 'C');        // CB
'AB'.replace(/A/, 'C');        // CB
'AB'.replace('A', () => 'C');  // CB
---
'ABA'.replace('A', 'C');       // CBA
'ABA'.replace('A', 'C').replace('A', 'C');  // CBC

  1. A static string replacement A->C is supported
  2. Regex-based replacement is supported
  3. A function is supported for dynamic processing
  4. Each replace call handles only the first match (unless the regex has the /g flag)
Special replacement patterns in the replacement string

'AB'.replace(/A/, 'C');        // CB
'AB'.replace('A', () => 'C');  // CB

In the example above, the second argument can be a string or a function. A function is clearly more flexible, while a plain string leaves little room to work with. To make strings more capable, the replacement string supports a number of shortcuts, all of which start with $


6.1 The dollar sign

This one is not very useful in practice, so just get the idea: $$ inserts a literal dollar sign, so every pair of dollar signs in the replacement string becomes one

'AB'.replace('A', '$$');   // $B
'AB'.replace('A', '$$$');  // $$B
'AB'.replace('A', '$$$$'); // $$B

  1. A is matched and replaced; the replacement contains one pair $$, which becomes a single $
  2. $$$: the first pair becomes $, and the remaining single $ is kept as-is, giving $$
  3. $$$$: each of the two pairs becomes $, giving $$

A basic understanding of the above is enough; there is no need to go deeper


6.2 $& The last matched string

Concept: $& represents the result of a match

'AB'.replace(/A/, 'C');  // CB
RegExp.lastMatch;        // A
RegExp['$&'];            // A
'AB'.replace(/A/, '$&'); // AB

  1. AB matches A through the regex, A is replaced with C, and the result is CB
  2. Note: lastMatch on the global RegExp object is updated to the matched value, A
  3. RegExp['$&'] is updated to A as well
  4. Using $& in the replacement string refers to the same matched value

Conclusion: reading these values from RegExp is not recommended. RegExp is a global object, so any regex operation overwrites them, which makes them unreliable


6.3 $` The content to the left of the match

Concept: similar to the above, but refers to the text to the left of the match

'CBA'.replace(/A/, 'D');  // CBD
RegExp.leftContext;       // CB
RegExp['$`'];             // CB
'CBA'.replace(/A/, '$`'); // CBCB

  1. CBA is matched by the regex /A/, A is replaced with D, and the result is CBD
  2. Note: leftContext on the global RegExp object is updated to CB, the text to the left of the match A
  3. RegExp['$`'] is updated to CB as well
  4. Using $` in the replacement string inserts CB, giving CBCB

6.4 $' The content to the right of the match

Concept: the opposite of the above

'ABC'.replace(/A/, 'D');  // DBC
RegExp.rightContext;      // BC
RegExp["$'"];             // BC
'ABC'.replace(/A/, "$'"); // BCBC

  1. ABC is matched by the regex /A/, A is replaced with D, and the result is DBC
  2. Note: rightContext on the global RegExp object is updated to BC, the text to the right of the match A
  3. RegExp["$'"] is updated to BC as well
  4. Using $' in the replacement string inserts BC, giving BCBC
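
The $ shortcuts are easiest to compare when used purely inside replacement strings, with no reliance on the global RegExp object. A small sketch with arbitrary inputs:

```javascript
// $& is the matched text itself.
const bracketed = 'AB'.replace(/A/, '[$&]');     // '[A]B'

// $` is everything to the left of the match.
const leftCopy = 'CBA'.replace(/A/, '($`)');     // 'CB(CB)'

// $' is everything to the right of the match.
const rightCopy = 'ABC'.replace(/A/, "($')");    // '(BC)BC'

// $$ is a literal dollar sign, here followed by the matched text.
const withDollar = 'cost: 5'.replace(/\d+/, '$$$&'); // 'cost: $5'

console.log(bracketed, leftCopy, rightCopy, withDollar);
```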

6.5 $n Capture group results

Concept: whatever is wrapped in parentheses in the regex is stored in $n. Note: the numbering starts at 1

'AB'.replace(/(A)/g, 'C'); // CB
RegExp.$1; // A
RegExp.$2; // ""
---
'ABC'.replace(/(A).(C)/g, '');
RegExp.$1; // A
RegExp.$2; // C
RegExp.$3; // ""

  1. The prerequisite is that the content is wrapped in parentheses, which are covered later
  2. As you can see, the first regex has only one pair of parentheses, so only $1 has a value
  3. The second regex has two pairs of parentheses
    1. So both $1 and $2 have values
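
In everyday code, $n is mostly used inside the replacement string rather than read from RegExp. A short sketch, using a made-up name string as input:

```javascript
// $1 and $2 refer to the first and second capture groups.
const swapped = 'John Smith'.replace(/(\w+)\s(\w+)/, '$2, $1'); // 'Smith, John'

// The same groups are passed to a callback after the whole match.
const viaCallback = 'John Smith'.replace(
  /(\w+)\s(\w+)/,
  (whole, first, last) => `${last}, ${first}`
); // 'Smith, John'

console.log(swapped, viaCallback);
```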

6.6 $Group result TODO


3. Pattern of the regular expression

Patterns also fall into two types:

  1. Simple patterns, such as /abc/
  2. Special character patterns, such as /ab*c/

The special character patterns are the ones to focus on; they are explained in detail below


4. Simple patterns

const reg = /abc/;
console.log(reg.test('a'));   // false
console.log(reg.test('ab'));  // false
console.log(reg.test('abc')); // true
console.log(reg.test('abcd'));// true

A simple pattern really is simple: it is a literal match, true if the string contains the sequence and false if it does not


5. Special character patterns

Regular expressions have many special symbols, but they fall into five main categories:

  1. Assertions
  2. Character classes
  3. Groups and ranges
  4. Quantifiers
  5. Unicode escapes

5.1. Assertion classes

5.1.1 ^ Matches start

Concept: ^ matches the beginning

const reg = /^A/;
reg.test('AB'); //true
reg.test('BA'); //false

Matching the start of a string with a given character is easy to understand, but one thing to be aware of is how ^ behaves with multi-line text

const reg = /^A/;
reg.test(`B
A`); // false

// Multiline mode can be turned on with the m flag
const multilineReg = /^A/m;
multilineReg.test(`B
A`); // true

The /m here is only a preview: m is the modifier for multiline matching. It is not expanded on at this point, because modifiers are covered at length later


5.1.2 $ Matches end

Concept: $ matches the end

const reg = /A$/;
reg.test('BA'); // true
reg.test('AB'); // false

Multiline mode applies here as well

const reg = /B$/;
reg.test(`B
A`); // false

// Multiline mode can be turned on with the m flag
const multilineReg = /B$/m;
multilineReg.test(`B
A`); // true
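
Two common uses of ^ and $ can be sketched as follows: anchoring both ends to test a whole string, and combining the m and g flags to pick out whole lines. The sample text is invented for illustration.

```javascript
// Anchoring both ends means the entire string must be exactly 'A'.
const exactlyA = /^A$/;
const whole = exactlyA.test('A');    // true
const partial = exactlyA.test('BA'); // false

// With m, ^ and $ match at line breaks; with g, every such line is returned.
const notes = 'todo\n# first\ntext\n# second';
const headings = notes.match(/^#.*$/gm); // ['# first', '# second']

console.log(whole, partial, headings);
```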

5.1.3 \b Matches word boundaries

Concept: \b matches a word boundary, which can be a bit confusing. See the following example

const reg = /A\b/;
reg.test('AB');      // false
reg.test('BA');      // true
reg.test('ABB');     // false
reg.test('BA abcd'); // true

  1. reg requires that the right side of the character A is a word boundary
  2. AB: the right side of A is not a boundary, it is B, so false
  3. BA: the right side of A is a boundary (the end of the string), so true
  4. ABB: the right side of A is not a boundary, it is B, so false
  5. BA abcd: the right side of A is a space, which forms a boundary, so true

5.1.4 \B Matches non-word boundaries

Concept: \b above matches word boundaries, while \B specifically matches positions that are not word boundaries

const reg = /A\B/;
reg.test('AB');  // true
reg.test('BA');  // false
reg.test('BAA'); // true

  1. reg requires that the right side of the character A is not a word boundary
  2. AB: the right side of A is not a boundary, it is B, so true
  3. BA: the right side of A is a boundary, so false
  4. BAA: the right side of the first A is not a boundary, it is A, so true
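
A typical use of \b is whole-word matching, with \B as its mirror image. The words below are arbitrary examples.

```javascript
// \b on both sides: 'cat' must stand alone as a word.
const wholeWord = /\bcat\b/;
const standalone = wholeWord.test('a cat sat');   // true
const embedded = wholeWord.test('concatenate');   // false

// \B on both sides: 'cat' must be surrounded by other word characters.
const insideWord = /\Bcat\B/;
const surrounded = insideWord.test('concatenate'); // true
const alone = insideWord.test('cat');              // false

console.log(standalone, embedded, surrounded, alone);
```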

5.1.5 x(?=y) Lookahead (forward) assertion (practical)

It is called a lookahead assertion because the condition looks ahead of x: the match result contains only x, never y. Concept: x is matched only if it is followed by y. Very useful

const reg = /A(?=B)/;
reg.test('AB');  // true
reg.test('BA');  // false
reg.test('ABC'); // true
reg.test('ACB'); // false

  1. reg requires that the character A is immediately followed by B
  2. AB: A is immediately followed by B, so true
  3. BA: A is not immediately followed by B, so false
  4. ABC: A is immediately followed by B, so true
  5. ACB: A is not immediately followed by B, so false

5.1.6 x(?!y) Negative lookahead assertion (practical)

const reg = /A(?!B)/;
reg.test('AB');  // false
reg.test('BA');  // true
reg.test('ABC'); // false
reg.test('ACB'); // true

  1. reg requires that A is not immediately followed by B
  2. AB: A is immediately followed by B, so false
  3. BA: A is not immediately followed by B, so true
  4. ABC: A is immediately followed by B, so false
  5. ACB: A is not immediately followed by B, so true
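
The key property of lookahead, that the condition is checked but not included in the result, shows up clearly when extracting text. The sketch below uses a made-up price string; the extra \d inside the negative lookahead is there so the engine cannot satisfy the condition by simply giving back a digit.

```javascript
const prices = '100USD 200EUR';

// Digits followed by USD; 'USD' itself is not part of the match.
const usdAmount = prices.match(/\d+(?=USD)/)[0]; // '100'

// Digits followed by neither USD nor another digit.
const otherAmount = prices.match(/\d+(?!USD|\d)/)[0]; // '200'

console.log(usdAmount, otherAmount);
```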

5.1.7 (?<=y)x Lookbehind (backward) assertion (practical)

Concept: x must be preceded by y. Compared with the lookahead assertion, the difference is where the condition sits relative to x:

  1. Lookahead: the condition comes after x, as in x(?=y)
  2. Lookbehind: the condition comes before x, as in (?<=y)x

const reg = /(?<=A)B/;
reg.test('AB');  // true
reg.test('BA');  // false
reg.test('ABC'); // true
reg.test('ACB'); // false

  1. reg requires that B is preceded by A; note that the matched subject is now B
  2. AB: B is preceded by A, so true
  3. BA: B is not preceded by A, so false
  4. ABC: B is preceded by A, so true
  5. ACB: B is not preceded by A, so false

5.1.8 (?<!y)x Negative lookbehind assertion

Concept: x must not be preceded by y

const reg = /(?<!A)B/;
reg.test('AB');  // false
reg.test('BA');  // true
reg.test('ABC'); // false
reg.test('ACB'); // true

  1. reg requires that B is not preceded by A; note that the matched subject is now B
  2. AB: B is preceded by A, so false
  3. BA: B is not preceded by A, so true
  4. ABC: B is preceded by A, so false
  5. ACB: B is not preceded by A, so true
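
Lookbehind works the same way for extraction: the preceding text is required or forbidden, but never returned. A sketch with an invented input string (lookbehind needs a reasonably modern JavaScript engine):

```javascript
const text = 'price $42, qty 7';

// Digits that directly follow a dollar sign; the '$' is not in the result.
const dollarAmount = text.match(/(?<=\$)\d+/)[0]; // '42'

// Digits preceded by neither '$' nor another digit.
const plainNumber = text.match(/(?<![$\d])\d+/)[0]; // '7'

console.log(dollarAmount, plainNumber);
```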

5.2 Character classes

Character classes differ from assertions: assertions test positions such as boundaries, whereas character classes determine what the content itself is

5.2.1 . Matches any character except the newline character

const reg = /./g
reg.exec('AB'); // ["A", index: 0, input: "AB", groups: undefined]
reg.exec('AB'); // ["B", index: 1, input: "AB", groups: undefined]
reg.exec('AB'); // null
reg.exec(`
`); // null

  1. reg is set to global mode
  2. reg matches . against AB, so the first result is A
  3. The second result is B
  4. No further match, so null is returned
  5. A newline character cannot be matched, so null is returned

5.2.2 \d Matches any number

Concept: Matches any number from 0 to 9

const reg = /\d/;
'1A'.match(reg); // ["1", index: 0, input: "1A", groups: undefined]

  1. reg is set to match any digit
  2. 1A is matched: the result is 1, the index is 0, and the input is 1A

5.2.3 \D Matches any non-number

Concept: Matches any character that is not 0-9

const reg = /\D/;
'1A'.match(reg); // ["A", index: 1, input: "1A", groups: undefined]

  1. reg is set to match any non-digit
  2. 1A is matched: the result is A, the index is 1, and the input is 1A

5.2.4 \w Matches any word character: Latin letters, digits, and underscore (worth memorizing)

Concept:

  1. Matches any lowercase letter a-z
  2. Matches any uppercase letter A-Z
  3. Matches any digit
  4. Matches the underscore _

const reg = /\w/;
reg.test('a'); // true
reg.test('A'); // true
reg.test('1'); // true
reg.test('_'); // true
---
reg.test('%'); // false

  1. \w matches the lowercase a
  2. \w matches the uppercase A
  3. \w matches the digit 1
  4. \w matches the underscore
  5. \w does not match %

5.2.5 \W Matches any non-word character

Concept: the inverse of \w. It matches any character that is none of the following:

  1. A lowercase letter
  2. An uppercase letter
  3. A digit
  4. The underscore _

const reg = /\W/;
reg.test('a');     // false
reg.test('A');     // false
reg.test('1');     // false
reg.test('_');     // false
---
reg.test('%');     // true
reg.test('-');     // true
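
As a brief illustration of how \d, \D, and \w are used in practice, the sketch below strips formatting from a phone-like string and checks that a name contains only word characters. The inputs and the isWordOnly helper are invented for this example.

```javascript
// \D with /g removes everything that is not a digit.
const digitsOnly = '(555) 123-4567'.replace(/\D/g, ''); // '5551234567'

// Hypothetical helper: true when the text consists solely of \w characters.
const isWordOnly = (text) => /^\w+$/.test(text);
const underscoreOk = isWordOnly('user_1'); // true
const hyphenFails = isWordOnly('user-1');  // false, '-' is not a word character

console.log(digitsOnly, underscoreOk, hyphenFails);
```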

5.2.6 \s Matches any whitespace character, including newline, space, tab, etc.

Concept: whitespace characters

  1. Matches spaces
  2. Matches newlines
  3. Matches tabs
  4. Matches carriage returns

const reg = /\s/;
reg.test(' ');    // true
reg.test('\n');   // true
reg.test('\t');   // true
reg.test(`
`);               // true
reg.test('1');    // false

  1. reg matches any whitespace character
  2. A space: true
  3. A newline escape: true
  4. A tab: true
  5. A literal line break in a template string: true
  6. A non-whitespace character: false

5.2.7 \S matches any non-whitespace character

Concept: Matches any non-whitespace character, which is actually a character with content

const reg = /\S/;
reg.test(1);    // true
reg.test('a');  // true
reg.test('A');  // true
reg.test('%');  // true
---
reg.test(' ');  // false
reg.test('\n'); // false

  1. reg matches non-whitespace characters
  2. Digits and Latin letters: true
  3. Symbols such as %: true
  4. Whitespace: false

For character classes, the ones above are enough to know; the rest are not covered here


5.3 Groups and ranges

Concept: combine alternatives, character ranges, and groups to describe sets of characters and sub-patterns

5.3.1 x|y Matches x or y

Concept: Match x or y

const reg = /A|B/
reg.exec('AC') // ["A", index: 0, input: "AC", groups: undefined]
reg.exec('CB') // ["B", index: 1, input: "CB", groups: undefined]
reg.exec('CD') // null
---
const reg = /A|B/g
reg.exec('ABC') // ["A", index: 0, input: "ABC", groups: undefined]
reg.exec('ABC') // ["B", index: 1, input: "ABC", groups: undefined]
reg.exec('ABC') // null

  1. Non-global mode
    1. reg matches A or B
    2. AC matches: A is returned
    3. CB matches: B is returned
    4. CD does not match: null is returned
  2. Global mode
    1. ABC: the first call matches A
    2. ABC: the second call matches B
    3. ABC: the third call finds no match and returns null

5.3.2 [x-y] Specifies a character range

With the | operator above, matching any digit from 0 to 3 would have to be written 0|1|2|3. With [0-3] you can specify the character range directly

const reg = /[0-3]/;
reg.exec('0') // ["0", index: 0, input: "0", groups: undefined]
reg.exec('1') // ["1", index: 0, input: "1", groups: undefined]
reg.exec('2') // ["2", index: 0, input: "2", groups: undefined]
reg.exec('3') // ["3", index: 0, input: "3", groups: undefined]
reg.exec('4') // null

Note: - denotes a range only when it sits between two characters, such as:

  1. a-z
  2. 0-9

If - is at the beginning or the end, it is treated as a literal character, for example:

  1. -az
  2. -09

const reg = /[-az]/;
reg.exec('a'); // ["a", index: 0, input: "a", groups: undefined]
reg.exec('b'); // null
reg.exec('-'); // ["-", index: 0, input: "-", groups: undefined]

As you can see, the [xyz] form is much more flexible than |, so if it is easier to remember you can simply use [xyz]


5.3.3 [^x-y] Negates the specified character range

Concept: the negation of the character range described above

const reg = /[^0-3]/;
reg.exec('0'); // null
reg.exec('1'); // null
reg.exec('2'); // null
reg.exec('3'); // null
reg.exec('4'); // ["4", index: 0, input: "4", groups: undefined]

  1. reg is set to negate the range 0-3
  2. So 0, 1, 2, and 3 do not match
  3. 4 matches
  4. The rules for - are the same as described above
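
The three forms of bracket expression (a list, a negated range, and a literal hyphen) can be sketched together. The input strings are arbitrary.

```javascript
// A list of characters: collect every vowel.
const vowels = 'regular expression'.match(/[aeiou]/g);
console.log(vowels.length); // 7

// A negated range: remove everything that is not a lowercase letter.
const lettersOnly = 'a1-b2'.replace(/[^a-z]/g, ''); // 'ab'

// A leading '-' is a literal hyphen, so this class means '-' or '+'.
const signs = '3-2+1'.match(/[-+]/g); // ['-', '+']

console.log(lettersOnly, signs);
```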

5.3.4 (X) Group capture

Concept: () groups part of a pattern. The return value is an array that contains the full match followed by the value captured by each group, in order

const reg = /(.)(.)./;
reg.exec('ABC'); // ["ABC", "A", "B", index: 0, input: "ABC", groups: undefined]
RegExp.$1; // A
RegExp.$2; // B
reg.exec('AB');  // null

  1. reg is defined to match three arbitrary characters
    1. Two of the . are wrapped in parentheses, each defined as a group
  2. Matching ABC returns an array with three values
    1. The full match: ABC
    2. The value captured by the first group: A
    3. The value captured by the second group: B
  3. Matching AB returns null, because there are fewer than three characters

Note that the captured values start at index 1 in the array, the same as the numbering of $n


5.3.4.1 (x) Group nesting logic

Since you can group, there must be nested groups. What is the execution logic of nested groups?

const reg = /((.).(.))((.)(.))/;
reg.exec('ABCDE');
// ["ABCDE", "ABC", "A", "C", "DE", "D", "E", index: 0, input: "ABCDE", groups: undefined]

  1. reg is a little more complicated:
    1. There are two outer groups
      1. Outer group 1, on the left
        1. Subgroup 1: any character
        2. Any character (not captured)
        3. Subgroup 2: any character
      2. Outer group 2, on the right
        1. Subgroup 3: any character
        2. Subgroup 4: any character
  2. Reading the result
    1. ABCDE: the full match
    2. ABC: the match of outer group 1, on the left
    3. A: the match of subgroup 1, inside outer group 1
    4. C: the match of subgroup 2, inside outer group 1
    5. DE: the match of outer group 2, on the right
    6. D: the match of subgroup 3, inside outer group 2
    7. E: the match of subgroup 4, inside outer group 2

Conclusion: the output follows a depth-first order. An outer group comes first, followed by the groups nested inside it, and only then the next outer group
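
Because the result of exec is an array with the full match first and the groups after it, array destructuring is a convenient way to name the captured values. A sketch with an invented date string:

```javascript
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// Position 0 is the full match; positions 1..3 are the three groups.
const [whole, year, month, day] = datePattern.exec('2024-01-15');

console.log(whole); // '2024-01-15'
console.log(year, month, day); // '2024' '01' '15'
```

Note that exec returns null when nothing matches, so real code would check the result before destructuring.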


5.3.5 \NUM Backreference

Concept:

  1. As mentioned before, captured values are stored on RegExp as $1, $2, $3, and so on
  2. But those are only available after a match. What if you want to use a captured value inside the regular expression itself?
  3. \NUM (also written \N) solves this problem

Syntax:

  1. \1 corresponds to $1
  2. \2 corresponds to $2
  3. \3 corresponds to $3
  4. And so on

This is not easy to grasp in the abstract, so it needs to be worked through with examples

const reg = /(.)\1/;
reg.exec('A');  // null
reg.exec('AB'); // null
reg.exec('AA'); // ["AA", "A", index: 0, input: "AA", groups: undefined]

  1. Definition of reg
    1. (.): matches any character, as the first group
    2. \1: takes the value captured by the first group ($1) and requires the same text again
  2. Matching A: null, because the string is too short
  3. Matching AB
    1. A is captured by the first group, so $1 is A
    2. At this point \1 stands for A
    3. B is then compared with A, and they do not match
    4. null is returned
  4. Matching AA
    1. A is captured by the first group, so $1 is A
    2. At this point \1 stands for A
    3. The second A is then compared with A, and they match
    4. The result is returned

We need to look at a few more examples:


5.3.5.1 Match four digits, the first and third digits are the same

// Match four digits where the first and third digits are the same
const reg = /(\d)\d\1\d/;
reg.exec('1234'); // null
reg.exec('1214'); // ["1214", "1", index: 0, input: "1214", groups: undefined]

  1. reg matches four digits
    1. (\d): the first group, matches a digit
    2. \d: matches a digit
    3. \1: must equal $1, the value of the first group
    4. \d: matches a digit
  2. Matching 1234
    1. $1 is 1
    2. The 3rd digit 3 does not equal 1, so null is returned
  3. Matching 1214
    1. $1 is 1
    2. The 3rd digit 1 equals 1, so the result is returned

5.3.5.2 Match four digits: the 1st and 3rd are the same, the 2nd and 4th are the same

// Match four digits where digits 1 and 3 are the same and digits 2 and 4 are the same
const reg = /(\d)(\d)\1\2/;
reg.exec('1234'); // null
reg.exec('1214'); // null
reg.exec('1312'); // null
reg.exec('1212'); // ["1212", "1", "2", index: 0, input: "1212", groups: undefined]

  1. reg matches four digits
    1. (\d): the first group, matches a digit
    2. (\d): the second group, matches a digit
    3. \1: must equal $1, the value of the first group
    4. \2: must equal $2, the value of the second group
  2. Matching 1234
    1. $1 is 1
    2. The 3rd digit 3 does not equal 1, so null is returned
  3. Matching 1214
    1. $2 is 2
    2. The 4th digit 4 does not equal 2, so null is returned
  4. Matching 1312
    1. $2 is 3
    2. The 4th digit 2 does not equal 3, so null is returned
  5. Matching 1212
    1. $1 is 1 and $2 is 2
    2. The 3rd digit 1 equals $1 and the 4th digit 2 equals $2, so the match succeeds

5.3.5.3 Match four digits: the 1st and 2nd are the same, the 3rd and 4th are the same

const reg = /(\d)\1(\d)\2/;
reg.exec('1234') // null
reg.exec('1233') // null
reg.exec('2233') // ["2233", "2", "3", index: 0, input: "2233", groups: undefined]

  1. reg matches four digits
    1. (\d): the first group, matches a digit
    2. \1: must equal $1, the value of the first group
    3. (\d): the second group, matches a digit
    4. \2: must equal $2, the value of the second group
  2. Matching 1234
    1. $1 is 1
    2. The 2nd digit 2 does not equal 1, so null is returned
  3. Matching 1233
    1. $1 is 1
    2. The 2nd digit 2 does not equal 1, so null is returned
  4. Matching 2233
    1. $1 is 2
    2. The 2nd digit 2 equals $1
    3. $2 is 3
    4. The 4th digit 3 equals $2
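
Backreferences are not limited to digits. A common practical use is spotting an accidentally repeated word; the sketch below is one way to do that, with made-up sentences as input.

```javascript
// (\w+) captures a word; \1 requires the same word again after whitespace.
const repeatedWord = /\b(\w+)\s+\1\b/;

const hit = repeatedWord.exec('this is is a test');
console.log(hit[0], hit[1], hit.index); // 'is is' 'is' 5

const miss = repeatedWord.exec('this is a test');
console.log(miss); // null
```

The leading \b matters here: without it, the "is" inside "this" followed by " is" would count as a repetition.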

5.3.6 (?<name>x) Named capture group

Concept:

  1. The previous backreferences all refer to groups by number
  2. Parentheses can be nested, so groups can be layered on top of each other, which quickly gets complicated; that is why naming groups is useful

Syntax:

  1. Naming: (?<name>x)
  2. Referencing: \k<name>

const reg = /(?<num1>\d)(?<num2>\d)/;
reg.exec('12'); // ["12", "1", "2", index: 0, input: "12", groups: {num1: "1", num2: "2"}]

  1. reg is redefined with two groups
    1. Group 1 is named num1
    2. Group 2 is named num2
  2. Matching 12
    1. The result has a groups property containing num1 and num2

5.3.7 \k<name> Backreference by name

Groups were named above with (?<name>x), but how is the name used? The syntax is:

  1. Naming: (?<name>x)
  2. Referencing: \k<name>

Note that the \k prefix is fixed: a named reference is not written as \1 or \2, so \k followed by the name is used instead

const reg = /(?<num>\d)\k<num>/;
reg.exec('12'); // null
reg.exec('11'); // ["11", "1", index: 0, input: "11", groups: {num: "1"}]

  1. reg is redefined as:
    1. (?<num>\d): the group is named num, and its rule is any digit
    2. \k<num>: refers to the group named num

In effect it has the same capability as \1 above; the only difference is that one refers to the group by name and the other by number
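
Named groups pay off outside the pattern as well: the captured values can be read from the groups property, and the replacement string accepts $<name>. A sketch with an invented date string and group names:

```javascript
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Read captured values by name instead of by position.
const { year, month, day } = isoDate.exec('2024-01-15').groups;
console.log(year, month, day); // '2024' '01' '15'

// $<name> in a replacement string inserts the named group's value.
const european = '2024-01-15'.replace(isoDate, '$<day>/$<month>/$<year>');
console.log(european); // '15/01/2024'
```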


5.3.8 (?:x) Non-capturing group

Concept:

  1. The idea of capture groups is clear by now: part of the pattern is grouped, and the captured values appear in the match result
  2. But here is the problem: not every group's result is meaningful
  3. So how do you keep only the meaningful capture groups?

(?:x) marks a group whose value does not need to be output in the result

const reg = /(\d)\d/;
reg.exec('12'); // ["12", "1", index: 0, input: "12", groups: undefined]
---
const reg = /(?:\d)\d/;
reg.exec('12'); // ["12", index: 0, input: "12", groups: undefined]

  1. The only difference between the two regexes is the added ?: in the group
  2. In the results
    1. With the first regex, the group's value is output in the result
    2. With the second regex, the group's value is not output

5.4 Quantifiers

Concept: quantifiers control how many times an element is matched


5.4.1 * Matches 0 or more times

const reg = /AB*/;
reg.exec('A');      // ["A", index: 0, input: "A", groups: undefined]
reg.exec('AB');     // ["AB", index: 0, input: "AB", groups: undefined]
reg.exec('ABB');    // ["ABB", index: 0, input: "ABB", groups: undefined]
reg.exec('ABC');    // ["AB", index: 0, input: "ABC", groups: undefined]
reg.exec('ABBBBC'); // ["ABBBB", index: 0, input: "ABBBBC", groups: undefined]
reg.exec('AC');     // ["A", index: 0, input: "AC", groups: undefined]

  1. Definition of reg:
    1. Match A
    2. Match B zero or more times
  2. A: A matches, and B matches 0 times
  3. AB: A matches, and B matches once
  4. ABB: A matches, and B matches twice
  5. ABC: A matches, and B matches once
  6. ABBBBC: A matches, and B matches 4 times
  7. AC: A matches, and B matches 0 times

Note: zero occurrences also count as a match, which is easy to overlook


5.4.2 + Matches one or more times

Concept: the only difference with * is that + must have at least one

const reg = /AB+/;
reg.exec('A');      // null
reg.exec('AB');     // ["AB", index: 0, input: "AB", groups: undefined]
reg.exec('ABB');    // ["ABB", index: 0, input: "ABB", groups: undefined]
reg.exec('ABC');    // ["AB", index: 0, input: "ABC", groups: undefined]
reg.exec('ABBBBC'); // ["ABBBB", index: 0, input: "ABBBBC", groups: undefined]
reg.exec('AC');     // null

The only thing to remember is that + requires at least one occurrence, while * also accepts zero


5.4.3 ? Matches 0 or 1 times

const reg = /AB?/;
reg.exec('A')   // ["A", index: 0, input: "A", groups: undefined]
reg.exec('AB')  // ["AB", index: 0, input: "AB", groups: undefined]
reg.exec('ABB') // ["AB", index: 0, input: "ABB", groups: undefined]

  1. Definition of reg:
    1. Match A
    2. Match B zero times or once
  2. A: A matches, and B matches 0 times
  3. AB: A matches, and B matches once
  4. ABB: A matches, and B matches once

Note: with ?, even if several B's are present, only one B is included in the match


5.4.3.1 ? Non-greedy behavior

As mentioned earlier:

    • ? matches 0 or 1 times
    • + matches one or more times

What happens if you combine them?

const reg = /AB+/;
reg.exec('AB');   // ["AB", index: 0, input: "AB", groups: undefined]
reg.exec('ABB');  // ["ABB", index: 0, input: "ABB", groups: undefined]
reg.exec('ABBB'); // ["ABBB", index: 0, input: "ABBB", groups: undefined]
---
const reg = /AB+?/;
reg.exec('AB');   // ["AB", index: 0, input: "AB", groups: undefined]
reg.exec('ABB');  // ["AB", index: 0, input: "ABB", groups: undefined]
reg.exec('ABBB'); // ["AB", index: 0, input: "ABBB", groups: undefined]

  1. reg is set to AB+, so B matches one or more times
    1. The results show that as many B's as possible were matched
  2. reg is set to AB+?, so B becomes non-greedy
    1. The results show that B was matched only once

This is not needed very often in practice, so it is enough to be aware of it
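
One place where the greedy versus non-greedy distinction does change the outcome is delimiter matching. A minimal sketch with a made-up tag string:

```javascript
const html = '<a><b>';

// Greedy: .+ grabs as much as possible, so the match runs to the last '>'.
const greedy = html.match(/<.+>/)[0]; // '<a><b>'

// Non-greedy: .+? stops at the first '>' that completes the pattern.
const lazy = html.match(/<.+?>/)[0];  // '<a>'

console.log(greedy, lazy);
```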


5.4.4 x{n} Matches n times

So far every quantifier meant 0 times, 1 time, or many times, with no finer control. What if you want to match exactly 2, 3, or 4 times? Concept: x{n} solves this problem. It must be clear that n applies to the preceding rule, not to the specific character it matched

const reg = /A{2}/;
reg.exec('A');  // null
reg.exec('AB'); // null
reg.exec('AA'); // ["AA", index: 0, input: "AA", groups: undefined]
---
const reg = /.{2}/;
reg.exec('A');  // null
reg.exec('AB'); // ["AB", index: 0, input: "AB", groups: undefined]
reg.exec('AC'); // ["AC", index: 0, input: "AC", groups: undefined]

  1. reg requires A exactly 2 times
  2. A: there is only one A, so the match fails
  3. AB: there is only one A, so the match fails
  4. AA: there are two A's, so the match passes
  5. reg is set to any 2 characters
  6. A: there is only one character, so the match fails
  7. AB: 2 characters, passes
  8. AC: 2 characters, passes

Note:

  1. The n occurrences must be consecutive
  2. {n} repeats the rule: in the second example . matches any character, so it does not matter whether the two characters are the same, only that each one satisfies the rule

5.4.5 x{n,} Matches at least n times

x{n} above forces exactly n matches, so it cannot express "at least this many times". Concept: x{n,} matches at least n times

const reg = /A{3,}/;
reg.exec('A')    // null
reg.exec('AA')   // null
reg.exec('AAA')  // ["AAA", index: 0, input: "AAA", groups: undefined]
reg.exec('AAAA') // ["AAAA", index: 0, input: "AAAA", groups: undefined]
  1. Define reg: match A at least three times
  2. A match: one A, fails
  3. AA match: two A's, fails
  4. AAA match: three A's, passes
  5. AAAA match: four A's, passes, and all four are returned
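
As a rough illustration of where "at least n times" is handy, the sketch below collapses every run of two or more spaces into a single space. The function name collapseSpaces is made up for this example, and it assumes the string replace method together with the g modifier.

```javascript
// Hypothetical helper: replaces each run of 2 or more spaces with one space.
function collapseSpaces(text) {
  return text.replace(/ {2,}/g, ' ');
}

console.log(collapseSpaces('a  b   c d')); // "a b c d"
```

A single space is left untouched because it does not reach the minimum of two repetitions.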

5.4.6 x{n,m} matches n to m times

const reg = /A{0,2}/
reg.exec('B')   // ["", index: 0, input: "B", groups: undefined]
reg.exec('A')   // ["A", index: 0, input: "A", groups: undefined]
reg.exec('AA')  // ["AA", index: 0, input: "AA", groups: undefined]
reg.exec('AAA') // ["AA", index: 0, input: "AAA", groups: undefined]
  1. Define reg to match the character A 0 to 2 times
  2. B match: zero A's, passes and returns "" (an empty match); this is worth noting
  3. A match: one A, passes
  4. AA match: two A's, passes
  5. AAA match: three A's, passes, but only AA is returned
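
Because an unanchored x{n,m} happily matches part of a longer string (or nothing at all), a range check on the whole string usually needs boundaries. The following sketch assumes the ^ and $ rules and uses illustrative variable names.

```javascript
const unanchored = /A{0,2}/;
const anchored = /^A{0,2}$/;

console.log(unanchored.test('B'));   // true: zero A's is allowed, so the empty match passes
console.log(unanchored.test('AAA')); // true: the first two A's satisfy the rule

console.log(anchored.test(''));      // true: zero A's
console.log(anchored.test('AA'));    // true: two A's
console.log(anchored.test('AAA'));   // false: three A's exceed the maximum
console.log(anchored.test('B'));     // false: B is not allowed anywhere
```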

5.5 Modifiers

Concept: Modifiers (also called flags) add rules that apply to the whole regular expression

5.5.1 g Global search

const reg = /A/g;
[...'ABA'.matchAll(reg)];
// [
//   ["A", index: 0, input: "ABA", groups: undefined],
//   ["A", index: 2, input: "ABA", groups: undefined]
// ]

As you can see, with /g all matches are found in one go
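
The effect of g also depends on which method is called. The sketch below compares the string methods match and matchAll on the same input; the variable names are illustrative only.

```javascript
const str = 'ABA';

// Without g, match() returns only the first match, with index and input.
const first = str.match(/A/);
console.log(first.index); // 0

// With g, match() returns just the matched texts.
const texts = str.match(/A/g);
console.log(texts); // ["A", "A"]

// matchAll() keeps the details of every match, but it requires the g flag.
const indexes = [...str.matchAll(/A/g)].map((m) => m.index);
console.log(indexes); // [0, 2]

let threwTypeError = false;
try {
  str.matchAll(/A/); // no g flag
} catch (err) {
  threwTypeError = err instanceof TypeError;
}
console.log(threwTypeError); // true
```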


5.5.2 i Case-insensitive search

const reg = /A/i;
reg.exec('A') // ["A", index: 0, input: "A", groups: undefined]
reg.exec('a') // ["a", index: 0, input: "a", groups: undefined]

This one is straightforward: with i set, upper and lower case both match


5.5.3 m Multi-line search

const reg = /^A/
reg.exec(`
A`) // null
 
---
const reg = /^A/m
reg.exec(`
A`) // ["A", index: 1, input: "\nA", groups: undefined]

This one is also straightforward: with m set, ^ and $ match at the start and end of each line, not only at the start and end of the whole text
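
Combined with g, the m modifier makes it possible to pick something out of every line. This is a minimal sketch on a made-up three-line string; it assumes the \d rule and the string match method.

```javascript
const text = 'A1\nB2\nA3';

// Without m, ^ only matches at the very start of the text.
const withoutM = text.match(/^A\d/g);
console.log(withoutM); // ["A1"]

// With m, ^ also matches right after each line break.
const withM = text.match(/^A\d/gm);
console.log(withM); // ["A1", "A3"]

// With m, $ also matches right before each line break.
const lineEnds = text.match(/\d$/gm);
console.log(lineEnds); // ["1", "2", "3"]
```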


5.5.4 s Allows . to match newline characters

const reg = /./;
reg.exec('\n') // null

---
const reg = /./s
reg.exec('\n') // ["\n", index: 0, input: "\n", groups: undefined]

Without s, . does not match newline characters (other whitespace such as spaces and tabs matches either way). With s set, . matches newline characters as well
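
The sketch below shows the difference on a two-line string, along with the common [\s\S] workaround for environments without the s modifier. The sample text and variable names are made up for illustration.

```javascript
const text = 'first\nsecond';

// Without s, . cannot step over the line break.
const plainDot = /first.second/.test(text);
console.log(plainDot); // false

// With s, . matches the line break too.
const dotAll = /first.second/s.test(text);
console.log(dotAll); // true

// Workaround without s: [\s\S] matches any character, including line breaks.
const anyChar = /first[\s\S]second/.test(text);
console.log(anyChar); // true

// Ordinary whitespace such as a space is matched by . even without s.
const spaceMatches = /./.test(' ');
console.log(spaceMatches); // true
```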


5.5.5 y Sticky match (harder to understand)

We saw earlier that g enables global matching and that exec updates the lastIndex property of the regex after each match. Let's look at it again

const reg = /A/g
reg.exec('AA A'); // ["A", index: 0, input: "AA A", groups: undefined]
reg.lastIndex;    // 1

reg.exec('AA A'); // ["A", index: 1, input: "AA A", groups: undefined]
reg.lastIndex;    // 2

reg.exec('AA A'); // ["A", index: 3, input: "AA A", groups: undefined]
reg.lastIndex;    // 4

reg.exec('AA A'); // null
reg.lastIndex;    // 0
  1. Set reg to match A globally
  2. First match against AA A:
    1. Passes, matching at index 0; lastIndex, the start of the next search, becomes 1
  3. Second match against AA A:
    1. Passes, matching at index 1; lastIndex becomes 2
  4. Third match against AA A:
    1. Passes, matching at index 3; lastIndex becomes 4
  5. Fourth match against AA A:
    1. Fails, null is returned and lastIndex is reset to 0

As you can see, after the second match lastIndex is 2, yet the third match is found at index 3. In g mode the search starts at lastIndex and, if nothing matches at that exact position, keeps scanning forward. Index 2 is a space, so the search moves on to index 3

The drawback is that on a long input a failed search scans all the way to the end of the string. The purpose of /y is to match only at exactly lastIndex

const reg = /A/y

reg.lastIndex;    // 0
reg.exec('AA A'); // ["A", index: 0, input: "AA A", groups: undefined]
reg.lastIndex;    // 1

reg.exec('AA A'); // ["A", index: 1, input: "AA A", groups: undefined]
reg.lastIndex;    // 2

reg.exec('AA A'); // null
reg.lastIndex;    // 0
  1. Reg is set to sticky matching
  2. First match against AA A, starting at lastIndex = 0 and matching only at that position, without scanning forward:
    1. Passes, matching at index 0; lastIndex becomes 1
  3. Second match against AA A, starting at lastIndex = 1 and matching only at that position, without scanning forward:
    1. Passes, matching at index 1; lastIndex becomes 2
  4. Third match against AA A, starting at lastIndex = 2 and matching only at that position, without scanning forward:
    1. Index 2 is a space, so there is no match; null is returned and lastIndex is reset to 0

In fact, the essential difference between y and g is:

  1. y matches only at exactly lastIndex. If there is no match at that position, the match fails immediately
  2. g starts at lastIndex and, if there is no match there, keeps searching forward

Let’s take a look at the simplest example:

const reg = /A/g
reg.lastIndex;   // 0
reg.exec('BA');  // ["A", index: 1, input: "BA", groups: undefined]

---
const reg = /A/y
reg.lastIndex;   // 0
reg.exec('BA');  // null
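
A typical use for y is reading a string piece by piece, where every piece must begin exactly where the previous one ended. The following is only a sketch under that assumption: the rules table, the token types and the tokenize function are hypothetical names invented for this example, and the patterns assume the \d and character-class rules.

```javascript
// Hypothetical token rules for illustration; each one uses the y modifier.
const rules = [
  { type: 'number', re: /\d+/y },
  { type: 'ident', re: /[A-Za-z]+/y },
  { type: 'space', re: / +/y },
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const { type, re } of rules) {
      re.lastIndex = pos; // with y, exec only tries to match at exactly this index
      const m = re.exec(input);
      if (m) {
        if (type !== 'space') tokens.push({ type, text: m[0] });
        pos = re.lastIndex; // continue right after the matched piece
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character at index ${pos}`);
    }
  }
  return tokens;
}

console.log(tokenize('AB 12 C'));
// [ { type: 'ident', text: 'AB' }, { type: 'number', text: '12' }, { type: 'ident', text: 'C' } ]
```

With g instead of y, a rule could skip over an unexpected character and match further along, so the error at that position would go unnoticed.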

Conclusion: mastering regular expressions up to this point is enough for most purposes. The information above is from MDN