Do you often come across regular expressions without understanding what they say? Do you skip over them even when you see them in a project? Do you spend all day forgetting what to search for, unaware that some editors support fuzzy search with regular expressions?

The melody lingering in my ear is familiar, but I am no longer the youth I once was.

After working for a long time, I suddenly realized how important the regular expressions I had neglected really are. Searching, querying, the command line, validation, extraction, replacement... you can hardly do anything without this little tool. What many people don't know is that regular expressions are almost a second world inside JavaScript. They look simple, but plenty of details can trip us up.

Let's take a closer look at regular expressions, and at how powerful they are, as we walk through this article!

1. Create a regular expression

There are two ways to create a regular expression in JavaScript.

The first is with the constructor:

const reg = new RegExp('regular expression')

The second is with a literal:

const reg = /regular expression/
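As a minimal sketch of the two forms side by side (the variable names and the keyword value are made up for illustration): the constructor takes the pattern as a string, so every backslash has to be doubled, and it is the form to reach for when part of the pattern is only known at runtime.

```javascript
// Literal form: the pattern is written directly between slashes.
const literal = /\d+/g

// Constructor form: the pattern is a string, so each backslash is doubled.
const built = new RegExp('\\d+', 'g')

const sameSource = literal.source === built.source
console.log(sameSource) // true

const digits = 'a1b22'.match(built)
console.log(digits) // [ '1', '22' ]

// Hypothetical runtime value, e.g. something typed by a user.
const keyword = 'cat'
const dynamic = new RegExp(keyword, 'i')
const found = dynamic.test('Concatenate')
console.log(found) // true
```

If the runtime value may contain metacharacters such as . or *, it would need to be escaped before being passed to the constructor.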

2. Use regular expressions to match content

Regular matching is divided into precise matching and fuzzy matching

Precise matching

All we need to do is put the text we are looking for into the regular expression and match against it. Usage is simple:

const reg = /regular expression/
const str = 'This is an article about regular expressions'
str.match(reg)[0] // 'regular expression'

The regular expression here finds exactly the text we are looking for.

Fuzzy matching

Fuzzy matching needs to be described in point 3, so let’s look at it first

Const STR = /[\u4e00-\u9fa5]+/ const STR = 'Regpx' STR. Match (reg) //Copy the code

The regular expression here matches the search based on the rules we pass in. \ u4E00 -\ u9FA5 here indicates the Chinese range

\ u4E00 indicates the first Chinese character in the character set

\ U9FA5 represents the last Chinese character in the character set
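A small sketch of the range in use, pulling the Chinese runs out of a mixed string. The sample text is invented for illustration. The second pattern is an alternative that assumes an engine with ES2018 Unicode property escapes (note the u flag); it is not something this article relies on.

```javascript
const text = 'regex 正则 and 表达式'

// The range used in this article.
const byRange = text.match(/[\u4e00-\u9fa5]+/g)
console.log(byRange) // [ '正则', '表达式' ]

// Alternative: match by script name instead of a code point range.
// Assumes ES2018 Unicode property escapes, which require the u flag.
const byScript = text.match(/\p{Script=Han}+/gu)
console.log(byScript) // [ '正则', '表达式' ]
```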

OK, now that you know how to create and use a regular expression, let's get to know it properly.

3. Regular expressions

Those familiar with metacharacters can skip this section and go straight to section 4. It covers the usage, meaning, and pitfalls of the characters . | \ * + ? () [] {} ^ $, the built-in character classes, and greedy versus lazy matching.

First, the flags. Depending on the flags set on it, a regular expression can match across multiple lines, globally, or case-insensitively:

  • m: Multi-line matching
  • g: Global matching
  • i: Ignore case

They show up in the examples throughout the rest of this article.
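The three flags listed above can be sketched as follows. The sample string is made up for illustration; each line changes one flag so the effect is visible.

```javascript
const text = 'Cat\ncat\nCAT'

// No flags: only the first case-sensitive match is returned.
const first = text.match(/cat/)[0]
console.log(first) // 'cat'

// g: every match, still case-sensitive.
const globalOnly = text.match(/cat/g)
console.log(globalOnly) // [ 'cat' ]

// g + i: every match, ignoring case.
const ignoreCase = text.match(/cat/gi)
console.log(ignoreCase) // [ 'Cat', 'cat', 'CAT' ]

// Without m, ^ and $ only match at the start and end of the whole string.
const noMultiline = text.match(/^cat$/gi)
console.log(noMultiline) // null

// With m, ^ and $ also match at the start and end of each line.
const multiline = text.match(/^cat$/gim)
console.log(multiline) // [ 'Cat', 'cat', 'CAT' ]
```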

3.1 Metacharacters

Metacharacters are the most basic building blocks of a regular expression; they describe what we can match. Let's get to know them.

  • Dot . : matches any single character except line terminators such as \n

    const reg = /.*/
    const str1 = 'abc'
    // str2 contains \n line breaks
    const str2 = 'a\nb\nc'

    str1.match(reg)[0] // 'abc'
    str2.match(reg)[0] // 'a'

  • Or | : logical "or" inside a regular expression (the description is abstract, so let's look at an example)

    const reg = /regular|expression/
    const str1 = 'regular expression'
    const str2 = 'expression'

    str1.match(reg)[0] // 'regular'
    str2.match(reg)[0] // 'expression'

Explanation:

If the content before | can match, the part before it is matched.

If the content before | cannot match but the content after it can, the part after it is matched.

  • Backslash \ : escapes the following character

    const reg = /\./
    const str = 'a.b.c.d'

    str.match(reg)[0] // '.'

An unescaped dot matches any character except \n. Escaped with \, it matches only the literal . character.

Points to note:

In some programming languages, matching a single backslash requires four backslashes: \\\\. In a JavaScript regex literal, you only need two.

const reg = /\\/
const str = '\\'

str.match(reg)[0] // '\\' (a single backslash character)
  • Asterisk * : the preceding item appears zero or more times in a row

    const reg = /a*/
    const str1 = 'aaabbcccddd'
    const str2 = 'bbcccddd'
    reg.test(str1) // true
    reg.test(str2) // true

  • Plus + : the preceding item appears one or more times in a row

    const reg = /a+/
    const str1 = 'aaabbcccddd'
    const str2 = 'bbcccddd'
    reg.test(str1) // true
    reg.test(str2) // false

  • Question mark ? : the preceding item appears zero times or once

    const reg = /a?/
    const str1 = 'aaabbcccddd'
    const str2 = 'bbcccddd'

    reg.test(str1) // true
    reg.test(str2) // true

  • Parentheses () : define a group. The contents of the parentheses form one group, and the group is matched as a whole.

    const reg = /(abc)/
    const str1 = 'abab'
    const str2 = 'abcab'
    reg.test(str1) // false
    reg.test(str2) // true

Extension:

The text matched by a parenthesized group can be referred to later in the same regular expression as \ plus the group's number. There are as many numbers as there are pairs of parentheses, and numbering starts at 1.

const reg = /(abc)\1/ // matches the same text as /(abc)abc/
const str = 'abcabc'
str.match(reg)[0] // 'abcabc'
// /(a)(b)(c)\1\2/ -> matches the same text as /(a)(b)(c)ab/
  • Square brackets [] : define a set. Exactly one of the characters in the brackets is matched.

    const reg = /[abc]/
    const str = 'efg'
    const str1 = 'aef'

    reg.test(str) // false
    reg.test(str1) // true

Note:

You can use - inside the brackets to denote a contiguous range

/[0-9]/ // the digits 0123456789
/[a-z]/ // all lowercase letters
/[A-Z]/ // all uppercase letters
  • Curly braces {m,n} : specify the number of occurrences

m is the minimum number of occurrences and n is the maximum. If n is omitted (as in {m,}), there is no upper limit.

const reg = /a{2,5}/
const str = 'aaaaa'
reg.test(str) // true

Extension:

{0,} is equivalent to *

{1,} is equivalent to +

{0,1} is equivalent to ?

  • Beginning ^ : matches only at the beginning of the string

    const reg = /^a/
    const str = 'abc'
    const str1 = 'bac'

    reg.test(str) // true
    reg.test(str1) // false

Note:

When it is the first character inside [], this symbol negates the set

const reg = /[^abc]/g
const str = '123'
str.match(reg) // ['1', '2', '3']

Only characters other than a, b, and c are matched.

  • End $ : matches only at the end of the string

    const reg = /a$/
    const str = 'abc'
    const str1 = 'cba'

    reg.test(str) // false
    reg.test(str1) // true
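In practice the metacharacters in this section are usually combined. As an illustrative sketch (the hex colour format is an example chosen here, not one taken from this article), the pattern below uses ^ and $ to anchor the whole string, [] for the allowed characters, {} for the counts, and () with | to allow two lengths.

```javascript
// "#" followed by exactly 3 or exactly 6 hexadecimal digits, and nothing else.
const hexColor = /^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/

const shortForm = hexColor.test('#fff')
const longForm = hexColor.test('#1a2B3c')
const wrongLength = hexColor.test('#12345')
const notAnchored = hexColor.test('color: #fff')

console.log(shortForm)   // true
console.log(longForm)    // true
console.log(wrongLength) // false (5 digits is neither 3 nor 6)
console.log(notAnchored) // false (^ requires the string to start with #)
```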

3.2 Built-in character classes

\d: a digit

\D: not a digit

\s: a whitespace character

\S: a non-whitespace character

\w: a letter, digit, or underscore

\W: anything other than a letter, digit, or underscore

\b: a word boundary

\B: a non-word boundary

const str = 'hello word! 520'
console.log(str.replace(/\d/g, '*')) // hello word! ***
console.log(str.replace(/\D/g, '*')) // ************520
console.log(str.replace(/\w/g, '*')) // ***** ****! ***
console.log(str.replace(/\W/g, '*')) // hello*word**520
console.log(str.replace(/\b/g, '*')) // *hello* *word*! *520*
console.log(str.replace(/\B/g, '*')) // h*e*l*l*o w*o*r*d!* 5*2*0
console.log(str.replace(/\s/g, '*')) // hello*word!*520
console.log(str.replace(/\S/g, '*')) // ***** ***** ***
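A few everyday uses of these classes, as a rough sketch with made-up input strings: \b for whole-word matching, \d for pulling numbers out of text, and \s for splitting on any run of whitespace.

```javascript
// \b keeps "cat" from matching inside "concat" or "cat5".
const wholeWord = 'cat concat cat5 Cat'.match(/\bcat\b/g)
console.log(wholeWord) // [ 'cat' ]

// \d+ collects each run of digits.
const numbers = 'Order 66 shipped 2024-01-15'.match(/\d+/g)
console.log(numbers) // [ '66', '2024', '01', '15' ]

// \s+ treats spaces, tabs, and newlines alike.
const fields = 'a  b\tc\nd'.split(/\s+/)
console.log(fields) // [ 'a', 'b', 'c', 'd' ]
```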

3.3 Greedy and lazy matching

Greedy matching matches as much content as possible

Lazy matching matches as little content as possible

Let's look at an example

const str = 'abcbcbc'
const reg = /a[bc]*c/g // greedy
console.log(str.match(reg)) // ['abcbcbc']
const reg1 = /a[bc]*?c/g // lazy
console.log(str.match(reg1)) // ['abc']

Explanation:

  • Greedy: abcbcbc satisfies the pattern all the way to the end of the string, so the match runs to the last position
  • Lazy: the match stops at the first point where the pattern is satisfied and goes no further.
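The difference matters most when a delimiter appears more than once. A minimal sketch with an invented string of quoted words: the greedy form swallows everything between the first and the last quote, the lazy form stops at the nearest closing quote, and a negated set is a common alternative that gives the same result here.

```javascript
const text = 'say "one" and "two"'

// Greedy: .* runs to the last quote in the string.
const greedy = text.match(/".*"/)[0]
console.log(greedy) // '"one" and "two"'

// Lazy: .*? stops at the first closing quote it can.
const lazy = text.match(/".*?"/g)
console.log(lazy) // [ '"one"', '"two"' ]

// Alternative: forbid quotes inside the match instead of using a lazy quantifier.
const negated = text.match(/"[^"]*"/g)
console.log(negated) // [ '"one"', '"two"' ]
```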

4. Advanced regular expression features

4.1 Named Group Matching

JavaScript supports named capture groups, so a match can be retrieved by a custom group name.

The fixed syntax for a named group is ?<group name> at the start of the group. The value is then read from the result as groups.<group name>.

const reg = /(?<name>[a-zA-Z]{3,8})(?<age>\d{2})/
const str = 'xiaoming23'
str.match(reg).groups.name // 'xiaoming'
str.match(reg).groups.age // '23'
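Named groups can also be used outside the groups object. The sketch below assumes an ES2018 engine and uses a made-up date string: the names are destructured from the match, referenced in a replacement string as $<name>, and referenced inside the pattern itself as \k<name>.

```javascript
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/

// Read the groups by name.
const { year, month, day } = '2024-01-15'.match(datePattern).groups
console.log(year, month, day) // 2024 01 15

// Refer to the groups by name in a replacement string.
const reordered = '2024-01-15'.replace(datePattern, '$<day>/$<month>/$<year>')
console.log(reordered) // '15/01/2024'

// Refer to a named group later in the same pattern.
const doubled = /(?<word>\w+) \k<word>/.test('hello hello')
console.log(doubled) // true
```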

4.2 Position Matching

  • ?= search content: checks whether the content to the right meets the condition, that is, whether what follows is the search content.

    const reg = /(?=abc)/
    const str = 'efabcefabc'

    str.replace(reg, '*') // 'ef*abcefabc'

Explanation:

Matches the position that is followed by abc (without the g flag, only the first such position is replaced)

  • ?! search content: checks whether the content to the right does not meet the condition, that is, whether what follows is not the search content

    const reg = /(?!abc)/
    const str = 'efabcefabc'

    str.replace(reg, '*') // '*efabcefabc'

Explanation:

Matches positions that are not followed by abc

  • ?<= search content: checks whether the content to the left meets the condition, that is, whether what precedes is the search content

    const reg = /(?<=abc)/
    const str = 'efabcefabc'

    str.replace(reg, '*') // 'efabc*efabc'

Explanation:

Matches positions that are preceded by abc.

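  • ?<! search content: checks whether the content to the left does not meet the condition, that is, whether what precedes is not the search content

    const reg = /(?<!abc)/
    const str = 'efabcefabc'

    str.replace(reg, '*') // '*efabcefabc'

Explanation:

Matches positions that are not preceded by abc.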

Section 4 summary:

  1. Named groups: ?<group name>
  2. Matching by suffix (lookahead): ?= and ?!
  3. Matching by prefix (lookbehind): ?<= and ?<!
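Position assertions become more useful when combined with something that consumes characters. A sketch with invented input strings, showing each of the four forms selecting numbers by what surrounds them without including the surroundings in the match:

```javascript
const sizes = '10px 20em 30px'

// ?= : numbers that are followed by "px".
const pxValues = sizes.match(/\d+(?=px)/g)
console.log(pxValues) // [ '10', '30' ]

// ?! : numbers not followed by "px" (the \d stops "1" of "10px" from matching alone).
const nonPxValues = sizes.match(/\d+(?!px|\d)/g)
console.log(nonPxValues) // [ '20' ]

const bill = 'cost $30, tax 5, tip $4'

// ?<= : numbers that are preceded by "$".
const dollarAmounts = bill.match(/(?<=\$)\d+/g)
console.log(dollarAmounts) // [ '30', '4' ]

// ?<! : numbers not preceded by "$" (or by another digit).
const plainNumbers = bill.match(/(?<![$\d])\d+/g)
console.log(plainNumbers) // [ '5' ]
```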

5. String and regular expression methods

5.1 String Methods

All of the following methods accept a regular expression.

  • replace(reg, replacement text or function): replaces matches

The second argument to this method is the powerful part: it accepts either the replacement text or a function.

  • Replacement text (1)

    const str = 'abc'

    str.replace(/a/, '*') // '*bc'

  • Replacement text (2): use $1, $2, and so on to insert the content matched by each group.

    const str = '123abc'

    str.replace(/(a)(b)(c)/, '$1$3$2') // '123acb'

Description:

$1 represents the content matched by (a)

$2 represents the content matched by (b)

$3 represents the content matched by (c)

Because the replacement text swaps the order of $2 and $3, the result is 123acb

  • Function

    const str = 'abc'
    str.replace(/(a)/, (source, $1, index) => {
      console.log(source, $1, index) // a a 0
      return '*'
    }) // '*bc'
    str.replace(/(b)(c)/, (source, $1, $2, index) => {
      console.log(source, $1, $2, index) // bc b c 1
      return $1 + 1
    }) // 'ab1'

Description:

The function receives the matched substring as its first argument (source), followed by one argument per pair of parentheses in the regular expression ($1, $2, $3, ...), and then the index at which the match was found. The return value is used as the replacement for the matched text.

  • match: searching

Matches the string according to the rules of the regular expression

const reg = /[abc]/g
const str = 'abc'
str.match(reg) // [ 'a', 'b', 'c' ]
  • split: splitting

    const str = 'abcabcabc'
    str.split(/ca/) // [ 'ab', 'b', 'bc' ]

This method also accepts a second parameter, limit, which caps the length of the returned array

const str = 'abcabcabc'
str.split(/ca/, 2) // [ 'ab', 'b' ]
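To round off the string methods, here is a short sketch with invented inputs. It shows a replacement function doing real work (kebab-case to camelCase), the difference in what match returns with and without the g flag, and split with a limit.

```javascript
// replace with a function: uppercase the letter that follows each hyphen.
const camel = 'background-color'.replace(/-(\w)/g, (match, letter) => letter.toUpperCase())
console.log(camel) // 'backgroundColor'

// match without g: one result, with extra details such as index.
const firstOnly = 'a1b22'.match(/\d+/)
console.log(firstOnly[0], firstOnly.index) // 1 1

// match with g: every matched string, but no index or groups.
const all = 'a1b22'.match(/\d+/g)
console.log(all) // [ '1', '22' ]

// split with a limit of 2: the remaining pieces are dropped.
const limited = '2024-01-15'.split(/-/, 2)
console.log(limited) // [ '2024', '01' ]
```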

5.2 RegExp methods

  • test: checks whether the string matches the regular expression

    const reg = /abc/
    const str = 'abc'
    const str1 = 'ab'

    reg.test(str) // true
    reg.test(str1) // false

  • exec: matches the string according to the regular expression, much like match

    const reg = /abc/
    const str = 'abc'
    const str1 = 'ab'

    reg.exec(str)[0] // 'abc'
    reg.exec(str1) // null
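With the g flag, both methods remember where the last match ended (in the lastIndex property of the regular expression). The sketch below, using made-up strings, shows how that lets exec walk through all matches in a loop, and how the same behaviour can surprise you when a global regular expression is reused with test.

```javascript
const reg = /\d+/g
const str = 'a1b22c333'

// Each exec call continues from reg.lastIndex and returns null when done.
const found = []
let m
while ((m = reg.exec(str)) !== null) {
  found.push([m[0], m.index])
}
console.log(found) // [ [ '1', 1 ], [ '22', 3 ], [ '333', 6 ] ]

// Pitfall: a global regex reused with test alternates between true and false,
// because the second call starts searching after the first match.
const globalReg = /a/g
const results = [globalReg.test('a'), globalReg.test('a'), globalReg.test('a')]
console.log(results) // [ true, false, true ]
```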

6. Hands-on practice

Understanding the material is not enough; it takes practice too. This section prepares a few regular expressions that are common but not so easy to understand, so you can try your hand at them.

6.1 Matching HTML Tags (Including tags)

First, let’s create a string

const html = '<div></div>'

Now let's write a regular expression that describes a tag. It has the following characteristics:

  1. It starts with <
  2. The tag name consists of English letters
  3. It ends with </tag name>

First version

const reg = /<(\w+)><\/\1>/g
// verify
html.match(reg) // ['<div></div>']

\1 refers back to the content matched by the first group, as described above.

Look at the result. Perfect. But there is a problem. Tags are usually not written on a single line; there are \n line breaks between them. OK, let's modify the HTML string

const html = '<div>\n\n</div>'
// verify
html.match(reg) // null

😰 Oh no, it failed.

Don't worry. We only matched the tags, not the content between them. Since . does not match \n, we need some other construct to match it.

Second version

const reg = /<(\w+)>([\s\S]*)<\/\1>/g
// verify
html.match(reg) // ['<div>\n\n</div>']

(^o^)/ success.

As we can see here, [\s\S] is used to match \n, because [\s\S] means whitespace plus non-whitespace, that is, any character at all. [\w\W] and [\d\D] work the same way. ([\b\B] does not: inside square brackets \b stands for the backspace character, not a word boundary.)

Third version

Besides paired tags, HTML also has single (self-closing) tags, so let's look at validating those. A single tag starts the same way as a paired tag, but it ends differently: single tags end with />. Let's write it down

// the beginning follows the same rule
let reg = /<(\w+)/
// with the end added
reg = /<(\w+)\/>/

Note that we should not use [\s\S] to match the contents of a single tag, which would cause other problems: we only want to match the attributes up to the closing />, and [\s\S]* would run past it. So this is the way to write it.

reg = /<(\w+)([^>]*)\/>/

[^>] matches any character that is not the closing >. * means zero or more occurrences

Anyway, let’s check it out

const html = '<img src="" />'
html.match(reg)[0] // '<img src="" />'

😄, success.

Combined version (not the final version)

For the combined version we use the regular expression metacharacter | to branch between the two cases

const reg = /<(\w+)>(([\s\S]*)<\/\1>)|(([^>]*)\/>)/mg
// <(\w+)>(([\s\S]*)<\/\1>) paired tag
// (([^>]*)\/>) single tag

Description: m indicates multi-line matching and g indicates a global match. (m only changes how ^ and $ behave, so it has no effect on this particular pattern.)

Verify:

const html = '<div title="1"></div><p></p><img src="asfs" alt="asdfa"/><br />'

html.match(reg)
// [ '<p></p>', '<img src="asfs" alt="asdfa"/>', '<br />' ]

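Success!

Although this match succeeds, the expression still has many problems. For instance, <div title="1"></div> in the string above was not matched, because the paired-tag branch does not allow attributes. Readers are welcome to improve it.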
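One possible improvement, offered only as a sketch and not as a finished solution: allow attributes in the opening tag of a paired tag, make the content lazy so that each match stops at the nearest closing tag, and give the single-tag branch its own tag name. The name tagPattern and the second sample string are made up here. The pattern still breaks on nested tags of the same name and on attribute values containing >, so for real HTML a proper parser is the safer choice.

```javascript
// Branch 1: <name attrs> content </name>   Branch 2: <name attrs/>
const tagPattern = /<(\w+)([^>]*)>([\s\S]*?)<\/\1>|<(\w+)([^>]*)\/>/g

const html = '<div title="1"></div><p></p><img src="asfs" alt="asdfa"/><br />'
const tags = html.match(tagPattern)
console.log(tags)
// [ '<div title="1"></div>', '<p></p>', '<img src="asfs" alt="asdfa"/>', '<br />' ]

// Line breaks and a different tag nested inside are handled too.
const multiLine = '<div>\n<p>text</p>\n</div>'
const outer = multiLine.match(tagPattern)
console.log(outer.length) // 1 (the whole <div>...</div>)
```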

6.2 Adding thousands separators to a number

Again, let's analyze the requirement first. Thousands separation means inserting a comma before every group of three digits, counting from the right. First, we create a numeric string

const str = '12345678'

To check whether a position is followed by three digits we use ?=, which tests whether what follows matches the rule. Start by creating the regular expression

First version

const reg = /(?=\d{3})/g

Because we need to process the whole number, we use g for a global match. OK, the regular expression is created; let's verify.

str.replace(reg, ',') // ',1,2,3,4,5,678'

Looking at the whole result, something is clearly wrong.

Take your time and read on. Since we want the digits to come in groups of three, we need to require one or more such groups here, using +

Second version

const reg = /(?=(\d{3})+)/g

We won't verify this one, because we're not done yet. Since the groups of three must be counted from the right-hand end, we also need $ to anchor the lookahead at the end of the string.

Second version, improved

const reg = /(?=(\d{3})+$)/g

Verification:

str.replace(reg, ',') // '12,345,678'

Looks like it worked. Let's check a few more values.

...

A problem shows up with this string.

const str = '123456789'
str.replace(reg, ',') // ',123,456,789'

Here the number has exactly nine digits, so the position in front of 123 also satisfies the rule. We therefore need to exclude the beginning of the string, using ?!

Third version

const reg = /(?!^)(?=(\d{3})+$)/g

Verify again:

str.replace(reg, ',') // '123,456,789'
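To use this on real values, the regular expression can be wrapped in a small helper. The sketch below is a variation, not the exact pattern above: addThousands is a hypothetical name, the fractional part is split off so only the integer part is grouped, and (?<=\d) replaces (?!^) so that a leading minus sign does not get a comma after it. The last line shows the built-in Intl.NumberFormat for comparison.

```javascript
// Hypothetical helper, not part of the original walkthrough.
function addThousands(value) {
  const [intPart, fracPart] = String(value).split('.')
  // (?<=\d) requires a digit on the left, so neither the start of the
  // string nor a leading minus sign is followed by a comma.
  const grouped = intPart.replace(/(?<=\d)(?=(\d{3})+$)/g, ',')
  return fracPart === undefined ? grouped : `${grouped}.${fracPart}`
}

console.log(addThousands(123456789))   // 123,456,789
console.log(addThousands(12345678))    // 12,345,678
console.log(addThousands(-123456))     // -123,456
console.log(addThousands('1234.5678')) // 1,234.5678
console.log(addThousands(999))         // 999

// Built-in alternative for comparison.
console.log(new Intl.NumberFormat('en-US').format(123456789)) // 123,456,789
```

When locale-aware output is needed, the built-in formatter is usually the better fit; the regular expression version is mainly useful as an exercise in lookarounds.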

7. Summary

Well, that's all we have to say about regular expressions in this article. We covered the following:

  • metacharacters
  • Position matching
  • String and regular expression methods explained
  • Practical operation

Regular expressions can be used in all sorts of ways. But the greater the power, the greater the potential for harm: only by truly mastering them can you use them with ease in practice; otherwise it is easy to cause a serious mishap.