The paper

Regular expressions are great tools for manipulating strings and improving productivity. A good regular can save you dozens or even hundreds of lines of code. At work, you may see a lot of regular strings in your code, or you may see no regular strings at all. The reason is that people who know regular strings tend to use regular strings first, and those who don’t have regular strings must use a lot of APIS. And in the interview, many students often hang in the regular here, so for the front-end regular is a necessary skill. This article will not only teach you how to use it, but also how it works so you can use it and interview it without fear.

Basis.

Var reg = /^\d$/g var reg = /^\d$/g var reg = /^\d$/g

1. The modifier

  • g– global Indicates a global match (the match will continue until no match is found).
  • i– ignoreCase Ignores case
  • m– multiline Matches multiple lines

2. Metacharacters

Metacharacters are divided into special metacharacters and ordinary metacharacters. Ordinary metacharacters include elements such as arrays and letters.

  • \Escape characters (to convert a normal character to a special character, or to convert a meaningful character to a normal character)
  • .Any character except \n (newline character)
  • \dMatches a number between 0 and 9
  • \DMatches a number that is not between 0 and 9 (uppercase and lowercase combinations have opposite meanings
  • \wMatches a character between 0 and 9 or between letters and underscores
  • \sMatches an arbitrary whitespace character
  • \bMatches a boundary character
  • x|yMatches one of x or y
  • [a-z]Matches any character from a to Z
  • [^a-z]Matches any character other than a-z
  • [xyz]Matches a character in x or Y or Z
  • [^xyz]The opposite of the top
  • (a)Match a small group (i.e. a small re within a large re)
  • ^Start with a metacharacter
  • $Ends with a metacharacter
  • ? :Match only, no capture
  • ? =Positive positive pre-check
  • ? !Positive negative pre-check

3. The quantifiers

Quantifiers are used to describe the number of metacharacters as follows:

  • +Make the preceding metacharacter appear one or more times
  • ?Zero to one occurrence
  • *Zero to multiple occurrences
  • {n}A n time
  • {n,}N to multiple occurrences
  • {n,m}N to m occurrences

See here you have some dizzy, these strange symbols, exactly how to use, the following will take you to apply these characters, the use of the students to do bear in mind that in mind before remember these characters, wants to play regular must bear in mind is that they, when you bear in mind that these characters at the same time you have already started.

Metacharacters

  • 1. Start metacharacter ^

    Match the metacharacter of the revealed position, for example: var reg = /^2/; If the start metacharacter is placed inside [], for example, [^] indicates the opposite meaning

  • 2. End the metacharacter $

    Match the end metacharacter, for example, var reg = /2$/; Var reg = /^2$/; The value can only be 2 because 2 represents only one metacharacter.

  • 3. Escape character \

    To convert a special metacharacter to a normal character, for example, var reg = /^2.3$/. But in this. Var reg = /^2\.3$/ special metacharacter = /^2\.3$/ special metacharacter = /^2\. Escape to a true. Element, rematch only 2.3 will match successfully think about it as follows:

    var reg1 = /^\d$/
    var reg2 = /^\\d$/
    var reg3 = /^\\\d$/
    var reg4 = /^\\\\d$/
    Copy the code

    Reg1 is the number between 0 and 9, so the match between 0 and 9 will be successful. Reg2 is the number between 0 and 9, so the match between 0 and 9 will be successful. \\[0-9] means either \ or 0-9. The reason is that the first \ escapes the second one, so \\d is split into two parts, \\ is a common character, \\ is a common character. \\[0-9] reg4 is a number between 0 and 9, so the correct answer is \\[0-9] reg4. Many students will think that the correct answer must be \\\[0-9], unfortunately, the correct answer is \\\\d, because when the first one escapes the second character, the third one escapes the fourth character. If you write var reg = /^$/, you will get an error. If you write var reg = /^$/, you will get an error. If you write var reg = /^$/, you will get an error. You need to write at least two \, for example: var reg = /^\ $/ the correct matching result is ‘\\’, so remember that two \ represents the real \.

  • 4. Or metacharacters x | y

    This metacharacter is easily understood as matching x or matching y. Here’s a quick example:

    var reg = / 2 | 3
    // Match the number 2 or 3
    Copy the code
  • 5. Small groups ()

    The main feature of the metacharacter () is that it treats the regular rule in parentheses as a whole, equivalent to a small independent regular, dealing with the problem in the following example

    var reg = 18 | $19 / / ^
    
    // This example starts with 18 or 19 but ends with? Does it really only match 19? It's not
    // The correct match is 181, 189, 819
    
    var reg = $/ / ^ 18 | (19)
    
    // Use () to wrap 18 or 19 around each other
    // This regex only matches the beginning and end of 18 or 19 instead of 181 and 189
    Copy the code
  • 6. Grouping references \n

    The concept of a grouped reference is that you can refer to small rules specified in a large re, for example:

    var reg = /^([a-z])([a-z])\2([a-z])$/
    // The required character string is as follows: book week HTTP... ​
    Copy the code

    In the example above, \2 represents the full reference of the second minor regular rule, which, like the second minor regular rule ([a-z]), can reduce the complexity of the regular and handle the rule of multiple repetitions

  • 7. Match character []

    [xyz], [^xyz], [A-z], [^ A-z], [^ A-z

    var reg = /^[a-zA_Z0-9_]$/
    // This regular sum is equivalent to \w, which matches a character between 0 and 9 or a letter or _
    // Xyz in regular [xyz] represents a-z, A_Z, 0-9, respectively. Xyz is just a representation identifier, and can have xyzhw various combinations
    // As in this example, there are four matches like underscore _
    // Special metacharacters in [] usually represent their own meanings, as follows
    var reg = / ^ [.? + &] $/
    // represents a match.? ? +... , etc.
    Copy the code
  • 8. Boundary character \b

    Match a word boundary, i.e. the position between the word and the space (the boundary is mainly the left and right sides of the word) for example:

    var reg = /er\b/
    // Match er in never, but not er in verb
    var reg = /\b\w+\b/g
    // Can match alphanumeric and underscore with word boundaries 'my blog is www.ngaiwe.com'
    // Can match 'my', 'blog', 'is', 'WWW', 'ngaiwe', 'com'
    Copy the code
  • 9. Match but no capture? :

    Here is an example, but when it comes to capturing content, if you don’t understand, you can skip here first, watch the following capture, and then come back

    var reg = /^(\d{6})(\d{4})(\d{2})(\d{2})\d{2}(\d)(\d|X)$/g
    var str = '110105199001220613'
    console.log(reg.exec(str))
    / / print the results for "110105199001220613", "110105", "1990", "01", "22", "1", "3"
    
    var reg = /^(\d{6})(? :\d{4})(\d{2})(\d{2})\d{2}(\d)(\d|X)$/g
    / / print the results for "110105199001220613", "110105", "01", "22", "1", "3"
    // The second small group will only be matched, not captured
    Copy the code
  • 10. Positive affirmations? =

    This is a difficult concept to understand. It is used to match whether the following elements of an element conform to the rule, but does not consume the rule. Example 1:

    var reg = /windows(? =95|98|NT|2000)/
    var str1 = 'windows2000'
    var str2 = 'windowsxp'
    
    console.log(reg.test(str1))
    console.log(reg.test(str2))
    Str1 is true str2 is false
    
    console.log(reg.exec(str1))
    console.log(reg.exec(str2))
    // Str1 can be captured and Windows does not capture 2000 as well
    // Forward pre-check matches only corresponding rules
    Copy the code

    Example 2:

    var reg1 = /win(? =d)dows/
    var reg2 = /win(d)dows/
    var str = 'windows'
    console.log(reg1.test(str))
    console.log(reg2.test(str))
    // reg1 returns true reg2 returns false
    // The reason is that the forward lookup only matches, and does not consume characters, that is, it does not match the characters of the rules inside
    // reg1 matches Windows and win
    // reg2 matches winddows
    Copy the code
  • 11. Trying to deny a pre-test? !

    In contrast to positive pre-check, matches a re that does not match the rule

Two ways to create a re

Re can be created in two ways: literals and constructors

  • 1. Literal creation

var reg = /\d+/img

  • 2. Constructor creation

Var reg = new RegExp(‘\\d+’, ‘img’) var reg = new RegExp(‘\\d+’, ‘img’

In general, regular metacharacters need to be created dynamically using constructors, because metacharacters are string concatenation, while regular fixed write dead regulars need to be created literally. Example:

var stringMode = 'string'
var reg = new RegExp(`^\\[object ${stringMode}$` \ \])
console.log(reg.toString())
Copy the code

4. Regular prototype method

Let’s have a look at the methods of re:

You can see that there are only three methods on the regular prototype object, Exec Test and toString

  • 1. Exec This method is designed primarily for use with capture groups, and the arguments are strings to match. As shown in the figure:

    Capturing principle

    • 1. Verify that the current string and the re match, return null if they do not match.
    • 2. If the match starts from the left of the string, search to the right and return the match

    Capture the result

    • The result is an array
    • 2. The first item 0 is the result of the match in the current grand re
    • 3.indexIs the index position of the matched result in the string
    • 4.inputThe original string for the current re operation
    • 5. If there are groups in the grand re(a), which is the result of each small grouping from the second item in the array
    • Here is an id card regular example for reference, specific matching rules will be introduced separately below, here only to learn the meaning of the field

  • 2. Lazy

    Regular capture is lazy. In the above exec, only the first content that meets the rule is captured by executing the exec once, and the first content is also captured by executing the exec twice, and the following content cannot be captured no matter how many times it is executed

    The solution

    Add the modifier G to the end of the re (global match)

    The principle of

    The default value is 0. The search starts at the first position in the string. Therefore, lastIndex does not change after exec execution, and manually changing lastIndex does not work. Browser doesn’t recognize

    Adding the g modifier solves this problem, because after each exec, the browser defaults to changing lastIndex, starting where it last left off, so you can get what follows

    Let’s manually create a function that matches everything and captures it, as follows:

    var reg = /^(\d{6})(\d{4})(\d{2})(\d{2})\d{2}(\d)(\d|X)$/g
    var str = '110105199001220613'
    
    RegExp.prototype.myExec = function myExec() {
    	var str = arguments[0] | |' '
    	var result = []
    	// First this refers to RegExp, so check whether this has the global modifier g
    	// If not, to prevent an infinite loop, we just execute exec once and return it
    	
    	if(!this.global) {
    		return this.exec(str)
    	}
    	
    	var arrs = this.exec(str)
    	while(arrs) {
    		result.push(arrs[0])
    		// Now the value of lastIndex has changed to the end of last time
    		arrs = this.exec(str)
    	}
    	return result
    }
    Copy the code

    This method returns the result of a large reg match if the reg modifier g is added, or exec capture if g is not added

  • 3. Test This method is mainly used for regular matching, but can also be used for capturing

    Matches from the left side of the string to a character that matches the regular rule, returns true, otherwise returns false. Similarly, test modifies lastIndex under the g modifier

    usingtestTo achieve capture

    var reg = /\{([a-z]+)\}/g
    var str = 'my name is {weiran}. I am from {china}'
    var result = []
    while (reg.test(str)) {
    	result.push(RegExp. $1)}console.log(result)
    // ['weiran', 'china']
    Copy the code

    Return false if test matches to the end or does not, and if successful add the current subgroup to the array to match the contents of the first element. Constructor for RegExp exists $1-$9, which refer specifically to the first through ninth captured contents of the current subgroup

  • 4.toString

    That is, converting a regular expression to a string

5. String regular method

  • 1.match

    It is also a capture method, as shown in the figure:

    When adding a modifier g, returns the large array of regular matching results, do not add modifiers g returns large regular and each small group consisting of an array, like our handwritten myExec above, actually principle is the way we write, if you want to return results when added modifiers g and not add, Therefore, the match method can be directly used to capture all the matching content. However, it also has limitations, just like the above mentioned in the addition of modifier G, will ignore the small group to capture the content, only capture the content of the large re, the solution is similar to myExec above, change arRS [0] to ARRS, in each match result, also save each small group.

  • 2.replace

    The first parameter is the string or re to be replaced, and the second parameter is the content or return function to be replaced. The operation is as follows:

    var str = 'my name is {weiran}, my blog is {ngaiwe}'
    var reg = /\{([a-z]+)\}/
    str = str.replace(reg, '123')
    console.log(str)
    // Print my name is 123, my blog is {ngaiwe}
    // Students will notice that the lazy behavior of exec is similar to that of g, which matches the first lastIndex without modifying g
    
    var reg = /\{([a-z]+)\}/g
    // Print my name is 123, my blog is 123
    Copy the code

    And replace does not modify the original string

    var str = 'my name is {weiran}, my blog is {ngaiwe}'
    var reg = /\{([a-z]+)\}/g
    str = str.replace(reg, function () {
    	console.log(arguments)})// Prints the currently matched small grouping, or undefined if no replacement value is returned
    Copy the code
  • 3.split

    In accordance with the regular way can be divided into arrays, specific examples are as follows:

    var str = 'weiRanNgaiWe'
    var reg = /[A-Z]/
    console.log(str.split(reg))
    // ["wei", "an", "gai", "e"] split into arrays in uppercase
    Copy the code
  • 4.search

    Similar to indexOf, returns the starting position of the matched element. If -1 is not returned, the g modifier is not supported

    var str = '[email protected]'
    var reg = /\d+/
    console.log(str.search(reg))
    / / return 7
    Copy the code

Six. Actual case analysis

  • 1. Id card number

    Take an id card example: 110105199109214237 from this analysis, the first six digits are composed of digits, the area code, then the four digits are year, two months, two days and four random, the penultimate singular male, even female, and the last digit may be a capital X, so the regular according to this rule is

    var str = '110105199109214237'
    var reg = /^(\d{6})(\d{4})(\d{2})(\d{2})\d{2}(\d)(\d|X)$/
    console.log(reg.exec(str))
    // ["110105199109214237", "110105", "1991", "09", "21", "3", "7", index: 0, input: "110105199109214237", groups: undefined]
    Copy the code
  • 2. Email

    For example, the email address [email protected] can be preceded by digits, letters, underscores (_), hyphens (-), periods (.), and hyphens (-). Can’t connected together / ^ \ w + ((- | \ w +) | (. \ w +)) * / opening must be digital, letter or an underscore, later may be – and figures underlined or letters. And alphanumeric underscore characters from 0 to more than 3. The @ part must first be alphanumeric multi-digit characters and then may exist is. Constituting the suffix of a mailbox or linking the preceding characters. And. – It has to be. Mailbox suffix

    var reg = /^\w+((-|\w+)|(\.\w+))*@[a-zA-Z0-9]+((\.|-)[a-zA-Z0-9]+)*\.[a-zA-Z0-9]+$/
    var str = '[email protected]'
    console.log(reg.test(str))
    // true
    Copy the code
  • 3. The interception of the URL

    For an example of intercept url parameters: www.ngaiwe.com/page/index…. {name: ‘weiran’,age: 27, sex: 0, HASH: ‘develpoment’} The following argument, the second capture of the hash value after # first matches the first one, and its rule is to match both sides of the equals sign so /()=()/, and match yes or no? Special characters for ampersand =#, save them in an obj object and then match hash, just like the first one but just match the part after #

    String.prototype.myQueryURLParameter = function myQueryURLParameter () {
    	var obj = {}
    	this.replace(/([^?&=#]+)=([^?&=#]+)/g.function () {
    		obj[arguments[1]] = arguments[2]})this.replace(/#([^?&=#]+)/g.function () {
    		obj['HASH'] = arguments[1]})return obj
    }
    Copy the code

7. To summarize

There are all kinds of regular cases on the Internet. Students can analyze how they write when they see the cases in the future. They should not use them blindly, so that they can keep thinking and learning in their work.

Blog eight.

Wei Ran technology blog

If you have any questions, please leave a message or send me to [email protected]

Share nine.

Regular metacharacter list regular online test tool