Category: Regular Expression Blog: blog.csdn.net/qtfying Nuggets: juejin.cn/user/430094… QQ: 2811132560 Email: [email protected]

The regular expressions above are simple, but before I get to them, I'd like to cover the basics of regular expressions, both to improve readability and to help the reader build up an understanding step by step.

Regular expression basics

Creating a regular expression

Regular expressions can be constructed in JavaScript in two ways.

  • The first is to wrap the regular expression in two forward slashes
  const regex = /cat/;
  • The second option is to use the RegExp constructor
  const regex = new RegExp('cat');
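
As a rough sketch of when each form is handy: the literal suits a pattern that is fixed in the source, while the constructor can build a pattern from a string at run time. The names animal, dynamicRegex, and digitsFromString are illustrative only and do not come from the original examples.

```javascript
// Literal form: the pattern is fixed when the script is written.
const literalRegex = /cat/;

// Constructor form: the pattern can be assembled from a variable.
const animal = 'cat'; // hypothetical runtime value
const dynamicRegex = new RegExp(animal);

console.log(literalRegex.source === dynamicRegex.source); // true

// Inside a string, a backslash must itself be escaped,
// so the literal /\d+/ becomes the string '\\d+'.
const digitsLiteral = /\d+/;
const digitsFromString = new RegExp('\\d+');
console.log(digitsLiteral.source === digitsFromString.source); // true
```

The doubled backslash is the usual stumbling block with the constructor form, which is one reason to prefer the literal whenever the pattern is known in advance.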

Usage

  • Usage is simple: either way, call the regex's test method and pass in the string to be tested
  regex.test('cat');         // true
  regex.test('persian-cat'); // true
  • There is an even shorter way: call test directly on a regex literal. A plain pattern like this is the simplest kind of regular expression; it checks whether the given text appears anywhere in the string
  /cat/.test('cat');         // true
  /cat/.test('persian-cat'); // true
  /cat/.test('woca tmd');    // false: a space interrupts "ca t"

Use special characters for more complex functions

Any character: .

The dot matches any single character other than a newline.

  const regex = /.og/;
  regex.test('fog');  // true
  regex.test('dog');  // true

The dot wildcard is one of the special characters. What if you want to match a literal dot character?

Escape character: \

The backslash \ turns a special character into an ordinary one. This makes it possible to search for a literal dot in text, because the escaped dot is no longer interpreted as a special character.

  // Unescaped: the dot is a wildcard
  const regex = /dog./;
  regex.test('dog.');   // true
  regex.test('dog1');   // true

  // Escaped: the dot must be a literal dot
  const regex1 = /dog\./;
  regex1.test('dog.');  // true
  regex1.test('dog1');  // false
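
Escaping matters most when a pattern is built from text you do not control. Here is a minimal sketch, assuming a hand-written helper named escapeRegExp (a hypothetical name, not a built-in used in this post) that puts a backslash in front of every special character:

```javascript
// Hypothetical helper: prefix every regex metacharacter with a backslash.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = 'dog.'; // illustrative value
const unsafe = new RegExp(userInput);             // the dot acts as a wildcard
const safe = new RegExp(escapeRegExp(userInput)); // the dot is a literal dot

console.log(unsafe.test('dog1')); // true
console.log(safe.test('dog1'));   // false
console.log(safe.test('dog.'));   // true
```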

Character set: []

Denoted by square brackets []. This pattern matches a single character, which can be any one of the characters inside the brackets.

  /[dfl]og/.test('dog'); // true
  /[dfl]og/.test('fog'); // true
  /[dfl]og/.test('log'); // true

Note that special characters inside a character set (such as the dot .) are no longer special, so no backslash is needed there. A character set can also contain ranges:

  /[A-z]/.test('abc'); // true
  /[A-z]/.test('Z');   // true

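Note that a range must run from a lower character code to a higher one, and uppercase letters come before lowercase ones. So /[A-z]/ is valid (although it also covers the few punctuation characters that sit between Z and a), while /[a-Z]/ raises an error: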

  const pattern = /[a-Z]/;

Uncaught SyntaxError: Invalid regular expression: /[a-Z]/: Range out of order in character class
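
The ordering rule is easier to see with the character codes printed out. The sketch below is only an illustration; it uses the RegExp constructor for the invalid range so that the error can be caught at run time instead of stopping the script from parsing.

```javascript
// Ranges follow character codes: 'Z' is 90 and 'a' is 97.
console.log('Z'.charCodeAt(0), 'a'.charCodeAt(0)); // 90 97

// [A-z] therefore also covers the characters between 'Z' and 'a', such as '_'.
console.log(/[A-z]/.test('_'));    // true
console.log(/[A-Za-z]/.test('_')); // false

// An out-of-order range is rejected with a SyntaxError.
let rangeError = null;
try {
  new RegExp('[a-Z]');
} catch (err) {
  rangeError = err;
}
console.log(rangeError instanceof SyntaxError); // true
```

If the goal is letters only, [A-Za-z] is the safer choice than [A-z].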

A character set can also be negated. For example, to match any character except d or f, put a caret ^ at the start of the set:

  /[^df]og/.test('dog'); // false
  /[^df]og/.test('fog'); // false
  /[^df]og/.test('log'); // true

If you want to match both uppercase and lowercase letters, be careful with [A-Za-z]; in that case it is usually better to use the case-insensitive flag i.

Repetition: {}

To match an exact number of occurrences of an expression, we can use {}. Let's look at an example. Suppose we want to match a phone number in the format +xx xxx xxx xxx:

  function isPhoneNumber(number) {
      // a plus sign, two digits, then three space-separated groups of three digits
      return /\+[0-9]{2} [0-9]{3} [0-9]{3} [0-9]{3}/.test(number);
  }

  isPhoneNumber('+12 123 123 123'); // true
  isPhoneNumber('123212'); // false

Note that the braces accept several forms:

  • {x} matches exactly x occurrences
  • {x,} matches at least x occurrences
  • {x,y} matches at least x and no more than y occurrences (with no space inside the braces)
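
The three forms can be compared side by side. This is a small sketch with made-up pattern names; the ^ and $ anchors are there so that each test looks at the whole string instead of a substring.

```javascript
const exactlyThree = /^a{3}$/;
const atLeastTwo = /^a{2,}$/;
const twoToFour = /^a{2,4}$/;

console.log(exactlyThree.test('aaa'));  // true
console.log(exactlyThree.test('aaaa')); // false
console.log(atLeastTwo.test('a'));      // false
console.log(atLeastTwo.test('aaaaa'));  // true
console.log(twoToFour.test('aaaa'));    // true
console.log(twoToFour.test('aaaaa'));   // false

// A space inside the braces stops them from being a quantifier:
// without the u flag, /^a{2, 4}$/ matches the literal text "a{2, 4}".
console.log(/^a{2, 4}$/.test('aaa'));     // false
console.log(/^a{2, 4}$/.test('a{2, 4}')); // true
```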

Zero or more repetitions: /.*/

An expression followed by an asterisk * matches zero or more times. It is equivalent to {0,}, so we can easily construct a pattern that matches any number of characters: /.*/
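
A short sketch of the asterisk in action, using illustrative patterns that are not part of the original examples:

```javascript
// "Zero or more" really does include zero.
console.log(/^lo*l$/.test('ll'));     // true
console.log(/^lo*l$/.test('looool')); // true

// {0,} behaves the same way as *.
console.log(/^lo{0,}l$/.test('ll'));  // true

// .* matches any run of characters, but the dot stops at a newline.
console.log(/^a.*z$/.test('a to z')); // true
console.log(/^a.*z$/.test('a\nz'));   // false
```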

Modifiers

Modifier  Description
i         Performs case-insensitive matching
g         Performs a global match (finds all matches instead of stopping after the first one)
m         Performs multi-line matching

Ignore case: i

  /dog/i.test('dog'); // true
  new RegExp('dog', 'i').test('DoG'); // true

Global matching: g

  var str = 'aa';
  var reg1 = /a/;
  str.match(reg1);  // ["a", index: 0, input: "aa"]
  var reg2 = /a/g;
  str.match(reg2);  // ["a", "a"]
  console.log(reg2.lastIndex);  // 0
  alert(reg2.test(str));        // true
  console.log(reg2.lastIndex);  // 1
  alert(reg2.test(str));        // true
  console.log(reg2.lastIndex);  // 2
  alert(reg2.test(str));        // false
  console.log(reg2.lastIndex);  // 0
  alert(reg2.test(str));        // true

From the example above, we can sum up the following points:

  • Without the g flag, the regex stops at the first match
  • With the g flag, matching continues until no further match can be found
  • A regex has a lastIndex property. With the g flag, each successful test sets it to the position just past the match (here it grows by one each time, because every match is one character long), and it is reset to 0 when no match is found. This statefulness is also described as being non-reentrant
  • After each test, you can manually reset it with regex.lastIndex = 0
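
The points above explain a common surprise when one global regex is reused across several strings. The following sketch (the words array and variable names are made up for illustration) shows the pitfall and two possible ways around it:

```javascript
const words = ['cat', 'cat', 'cat'];

// Pitfall: a shared global regex remembers where its last match ended.
const shared = /cat/g;
console.log(words.map((w) => shared.test(w))); // [ true, false, true ]

// Option 1: drop the g flag when only a yes/no answer is needed.
const plain = /cat/;
console.log(words.map((w) => plain.test(w))); // [ true, true, true ]

// Option 2: reset lastIndex before each test.
const results = words.map((w) => {
  shared.lastIndex = 0;
  return shared.test(w);
});
console.log(results); // [ true, true, true ]
```

Option 1 is usually the simpler choice, since test only needs to answer whether a match exists at all.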

Let’s look at one more application scenario

  const lorem = 'I_want_to_speak_english';
  lorem.replace('_', ' ');  // 'I want_to_speak_english'
  lorem.replace(/_/g, ' '); // 'I want to speak english'

With a global regex, the replace function iterates over every match and replaces each one with the second argument. With a plain string as the pattern, only the first occurrence is replaced.
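
The second argument of replace can also be a function, which is called once per match. This is a small sketch building on the lorem string above; the shouted variable and the capitalising behaviour are only for illustration.

```javascript
const lorem = 'I_want_to_speak_english';

// String pattern: only the first occurrence is replaced.
console.log(lorem.replace('_', ' '));  // 'I want_to_speak_english'

// Global regex: every occurrence is replaced.
console.log(lorem.replace(/_/g, ' ')); // 'I want to speak english'

// Replacer function: receives the whole match and each captured group.
const shouted = lorem.replace(/_(\w)/g, (match, letter) => ' ' + letter.toUpperCase());
console.log(shouted); // 'I Want To Speak English'
```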

Multi-line mode: m

I originally planned to leave multi-line mode until the end, but for completeness it belongs here with the other modifiers. Modifiers and regex operators are usually mixed together, so it is hard to isolate any one of them completely. Since this mode is about line breaks, the pattern is matched against each line of the string. Look at the example:

  // three lines, separated by newline characters
  const pets = 'dog\ncat\nparrot and other birds';

  /^dog$/m.test(pets); // true
  /^cat$/m.test(pets); // true
  /^parrot$/m.test(pets); // false

As you can see, the m flag changes the meaning of the caret ^ and the dollar sign $. In multi-line mode, they mark the beginning and end of each line rather than of the entire string, where lines are separated by the newline character \n. In the example, parrot appears on the third line of pets, but that line does not end with parrot, so the result is false.
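
To make the difference concrete, here is a sketch that runs the same pattern with and without the m flag, and then combines m with g. The string is written with \n escapes so that indentation cannot sneak into the lines.

```javascript
const pets = 'dog\ncat\nparrot and other birds';

// Without m, ^ and $ anchor to the whole string.
console.log(/^cat$/.test(pets));  // false
console.log(/^cat$/m.test(pets)); // true

// With g and m together, match returns every line that is a single word.
console.log(pets.match(/^\w+$/gm)); // [ 'dog', 'cat' ]

// A line that merely starts with a word matches once the $ is dropped.
console.log(/^parrot/m.test(pets)); // true
```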

Capture groups: ()

A capture group is a part of the pattern wrapped in parentheses. For example, in (\d)\d, the part (\d) is a capture group, and the number of () pairs is the number of groups. What is this for? Groups let us refer to parts of the match: each group is numbered by the position of its opening parenthesis, and it can also be given a name. We can then use $1, $2 and so on to pick out the value matched by each group, for example:

Number  Name  Capture group             Matched content
0       -     (\d{4})-(\d{2}-(\d\d))    2008-12-31
1       -     (\d{4})                   2008
2       -     (\d{2}-(\d\d))            12-31
3       -     (\d\d)                    31

If we name the groups, the table changes slightly:

Number  Name  Capture group                                 Matched content
0       -     (?<year>\d{4})-(?<date>\d{2}-(?<day>\d\d))    2008-12-31
1       year  (?<year>\d{4})                                2008
2       date  (?<date>\d{2}-(?<day>\d\d))                   12-31
3       day   (?<day>\d\d)                                  31

You can also name only some of the groups, mixing the two styles above, like this:

Number  Name  Capture group                    Matched content
0       -     (\d{4})-(?<date>\d{2}-(\d\d))    2008-12-31
1       -     (\d{4})                          2008
2       date  (?<date>\d{2}-(\d\d))            12-31
3       -     (\d\d)                           31
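
In JavaScript, the tables above map directly onto the result of exec: numbered groups are array entries, and named groups appear on the groups property. A minimal sketch, with datePattern and match as illustrative names:

```javascript
const datePattern = /(?<year>\d{4})-(?<date>\d{2}-(?<day>\d\d))/;
const match = datePattern.exec('2008-12-31');

// Numbered access: index 0 is the whole match, then one entry per group.
console.log(match[0]); // '2008-12-31'
console.log(match[1]); // '2008'
console.log(match[2]); // '12-31'
console.log(match[3]); // '31'

// Named access through the groups property.
console.log(match.groups.year); // '2008'
console.log(match.groups.date); // '12-31'
console.log(match.groups.day);  // '31'

// In a replacement string, a named group is written as $<name>.
console.log('2008-12-31'.replace(datePattern, '$<day>/$<year>')); // '31/2008'
```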

Looking back at /(\w+)\s(\w+)/

The pattern /(\w+)\s(\w+)/ is made up of the following pieces:

Element  Description
\w       Matches letters, digits, and underscores. Equivalent to [A-Za-z0-9_]
\s       Matches any whitespace character, including spaces, tabs, form feeds, and so on
()       Capture group


Now let's see what happens when /(\w+)\s(\w+)/ is used with replace:

  var re = /(\w+)\s(\w+)/;
  var str = "zara ali haha hehe";
  var newstr1 = str.replace(re, "$2, $1, $3");
  // RegExp.$1, RegExp.$2, ... are legacy static properties
  // that hold the groups of the most recent match
  console.log(RegExp.$1);
  console.log(RegExp.$2);
  console.log(RegExp.$3);
  console.log(newstr1);

The four outputs are:

  zara
  ali
  (an empty string)
  ali, zara, $3 haha hehe

There are only two capture groups:

  • $1 -> 'zara'
  • $2 -> 'ali'

There is no third capture group at all, so RegExp.$3 is an empty string. The replace call substitutes $2 and $1 with the matched groups and then appends the unmatched rest of the original string, so you might expect 'ali, zara,  haha hehe'. That is nearly right, but when $n refers to a group that does not exist in the pattern, replace leaves the text $n in place as it is, which gives 'ali, zara, $3 haha hehe'. That is the magic of regular expressions.
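
A few related replacement patterns are worth trying next to $n. The sketch below reuses re and str from the example; the swapped variable and the replacer function are illustrative additions, shown as one way to sidestep $n altogether.

```javascript
const re = /(\w+)\s(\w+)/;
const str = 'zara ali haha hehe';

// $3 does not refer to any group here, so it is copied literally.
console.log(str.replace(re, '$2, $1, $3')); // 'ali, zara, $3 haha hehe'

// $$ inserts a literal dollar sign; $& inserts the whole match.
console.log(str.replace(re, '$$[$&]')); // '$[zara ali] haha hehe'

// A replacer function receives the groups as arguments instead.
const swapped = str.replace(re, (whole, first, second) => `${second}, ${first}`);
console.log(swapped); // 'ali, zara haha hehe'
```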

Isn't life like this too, full of surprises, like the moon in a well, like water in the desert, like shooting stars over the polar regions… People love these unknowns, are fascinated by them, and explore them tirelessly…

Written in the afternoon of December 30, 2019