Regular expressions are the foundation of JavaScript

Preface

This article consists entirely of worked examples. For the regular expression syntax used throughout, see:

30-Minute Introduction to Regular Expressions

The article above is very comprehensive and worth reading several times

What can regular expressions do?

1. Get the characters between two specified strings

Suppose you have a long string and want to extract all the characters between two marker strings. The following expressions do this

// (1) Get all the characters between "spread" and "to"
var str = 'The manufacturers spread the idea of the products to attract more people to purchase';
var reg1 = /(?<=spread).+(?=to)/; // Greedy mode

str.match(reg1)
// [" the idea of the products to attract more people ", index: 24, input: "The manufacturers spread the idea of the products to attract more people to purchase", groups: undefined]

var reg2 = /(?<=spread).+?(?=to)/; // Lazy mode

str.match(reg2)
// [" the idea of the products ", index: 24, input: "The manufacturers spread the idea of the products to attract more people to purchase", groups: undefined]

// (2) Example two: get the characters between two specified strings

This regex mainly involves three concepts:
- Greedy and lazy matching
- (?=exp) is a zero-width positive lookahead: it asserts that the text after its position matches the expression exp
- (?<=exp) is a zero-width positive lookbehind: it asserts that the text before its position matches the expression exp
Both extracted strings have whitespace on either side, which can be trimmed if it is not needed
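
Lookbehind was only added to JavaScript in ES2018, so older engines reject (?<=exp). As a rough sketch of an alternative, the same extraction can be done with a capturing group. The helpers escapeRegExp and between are hypothetical names introduced only for this illustration:

```javascript
// Hypothetical helper: escape characters that are special inside a regex.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: return the trimmed text between `start` and the
// first following `end` (lazy match), or null when nothing matches.
function between(text, start, end) {
    var re = new RegExp(escapeRegExp(start) + '(.+?)' + escapeRegExp(end));
    var m = re.exec(text);
    return m ? m[1].replace(/^\s+|\s+$/g, '') : null;
}

var str = 'The manufacturers spread the idea of the products to attract more people to purchase';
var lazyResult = between(str, 'spread', 'to'); // 'the idea of the products'
```

The capturing group plays the role of the two assertions: the markers are consumed by the match, but only group 1 is returned.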

Here’s a regex that removes whitespace from both sides of a string:

2. Remove whitespace characters from both sides of the string

We use the results of the first example above as input

// First mount a trim function on the String prototype
// (modern engines already provide String.prototype.trim natively; this version shows the regex behind it)

String.prototype.trim = function() {
    return this.replace(/(^\s*)|(\s*$)/g, "");
}

var string1 = ' the idea of the products to attract more people '; // The result of the first example above (greedy mode)
var string2 = ' the idea of the products '; // The result of the first example above (lazy mode)

string1.trim(); // 'the idea of the products to attract more people'
string2.trim(); // 'the idea of the products'
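
If only one side needs trimming, each half of the pattern can be used on its own. A small sketch, with trimLeft and trimRight as hypothetical helper names written as plain functions so that nothing on String.prototype is overwritten:

```javascript
// Remove leading whitespace only.
function trimLeft(text) {
    return text.replace(/^\s+/, '');
}

// Remove trailing whitespace only.
function trimRight(text) {
    return text.replace(/\s+$/, '');
}

var leftTrimmed = trimLeft(' the idea of the products ');   // 'the idea of the products '
var rightTrimmed = trimRight(' the idea of the products '); // ' the idea of the products'
```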

3. Check whether a word in the string ends with specific characters

Again, use the result of the example above as the data source

For 'the idea of the products to attract more people', if you want to determine whether any word in the string ends in dea, the following regex can do it:

var string = 'the idea of the products to attract more people';
var reg = /\b.*(?=dea\b)/;
var reg1 = /\b.*(?=ade\b)/;

reg.test(string); // true
reg1.test(string); // false

// The string contains the word idea, which ends in dea, so reg gives true; no word ends in ade, so reg1.test returns false
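
The same check can be wrapped in a function that takes the suffix as a parameter. This is only a sketch: hasWordEndingWith is a hypothetical name, and it assumes the suffix consists of word characters only, so it is not escaped:

```javascript
// Return true when some word in `text` ends with `suffix`.
// Assumption: `suffix` contains only word characters (letters, digits, _).
function hasWordEndingWith(text, suffix) {
    return new RegExp('\\w*' + suffix + '\\b').test(text);
}

var sentence = 'the idea of the products to attract more people';
var endsInDea = hasWordEndingWith(sentence, 'dea'); // true  ("idea")
var endsInAde = hasWordEndingWith(sentence, 'ade'); // false
var endsInPro = hasWordEndingWith(sentence, 'pro'); // false ("pro" only starts "products")
```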

4. Filter JS and EXE files

If you want to filter out .js and .exe files, you can write an expression that does not match any string whose suffix is .js or .exe and matches everything else:

var reg = /^[^.]+$|\.(?!(js|exe)$)(?!.*\.(js|exe)$)|^.{0}$/; // Hard to understand
var reg1 = /^[^.]+$|\.(?!(js|exe)$)([^.]*?$)/; // Easier to understand
var reg2 = /^[^.]+$|\.(?!.*(js|exe)$)|^.{0}$/; // The easiest to understand

// Take a look at some examples
'foo.js'.search(reg); // -1, equivalent to test returning false
'bar.exe'.search(reg); // -1, equivalent to test returning false
'bar.exee'.search(reg);  // 3, equivalent to test returning true
'js.foo'.search(reg); // 2, equivalent to test returning true
'foobar'.search(reg); // 0, equivalent to test returning true
''.search(reg); // 0, equivalent to test returning true
'foo.js.js'.search(reg); // -1, equivalent to test returning false
'foo.js.jss'.search(reg); // 3, equivalent to test returning true
'.js.'.search(reg); // 0, equivalent to test returning true
'sj.jsj'.search(reg); // 2, equivalent to test returning true

// To make the analysis easier, the expression is split into three main parts, and part 2 is further divided into three sub-parts

// (1.0) ^[^.]+$
// (2.0) \.(?!(js|exe)$)(?!.*\.(js|exe)$)
//     (2.1) \.
//     (2.2) (?!(js|exe)$)
//     (2.3) (?!.*\.(js|exe)$)
// (3.0) ^.{0}$

This expression needs to be read piece by piece:
- Step 1: Check whether the string contains a "." character; if it does not, match it directly. This is the role of part (1.0)
- Step 2: From here on we assume the string must contain a "." character, so match the "." first. This is the role of part (2.1)
- Step 3: If the "." is followed by exe or js with nothing after it, that is, js or exe is the tail of the string, then it is an EXE or JS file and must not be matched. This is the role of part (2.2)
- Step 4: If there are several "." characters, the last one is the one that counts. This is the role of part (2.3)
- Step 5: The empty string is not handled by any of the steps above, so it needs to be handled separately. This is the role of part (3.0)
Note: The technique used here to check that a string contains no character of a given type is useful in its own right

var str = 'foo3bar';
var str1 = 'foobar';
/^[^\d]+$/.test(str);  // false
/^[^\d]+$/.test(str1);  // true
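
To see the filter applied to real input, here is a minimal sketch that runs the first expression over a list of file names. The names allowed and filterFiles are hypothetical and exist only for this illustration:

```javascript
// The same pattern as reg above: matches anything that is not a .js or .exe file.
var allowed = /^[^.]+$|\.(?!(js|exe)$)(?!.*\.(js|exe)$)|^.{0}$/;

// Keep only the names the pattern matches.
// The regex has no g flag, so test() carries no state between calls.
function filterFiles(names) {
    return names.filter(function (name) {
        return allowed.test(name);
    });
}

var kept = filterFiles(['foo.js', 'bar.exe', 'bar.exee', 'foobar', 'foo.js.jss', 'foo.js.js']);
// ['bar.exee', 'foobar', 'foo.js.jss']
```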

5. Check whether the string contains the specified string

Check whether a URL contains a specific string.

Requirement 1: Check whether the following URL contains resizeFlag

var foo = "http://localhost:8002/Home/Login#durationmgr/dailyTask/workPanel?resizeFlag=false";
var bar = "http://localhost:8002/Home/Login#durationmgr/dailyTask/workPanel";

var reg = /resizeFlag/; // Simple and effective, but only suitable for the simplest matches

reg.test(foo); // true
reg.test(bar); // false

// Let's look at another way
var reg1 = /^(?=.*?resizeFlag).*$/; // Positive lookahead: matches strings that contain resizeFlag
var reg2 = /^(?!.*?resizeFlag).*$/; // Negative lookahead: matches strings that do not contain resizeFlag

reg1.test(foo); // true
reg1.test(bar); // false
reg2.test(foo); // false
reg2.test(bar); // true

// Requirement 2: Check whether the following URL contains resizeFlag. If it does, extract the characters between # and ?

var foo = "http://localhost:8002/Home/Login#durationmgr/dailyTask/workPanel?resizeFlag=false";
var bar = "http://localhost:8002/Home/Login#durationmgr/dailyTask/workPanel?flight=1";

// Solution 1: check for the flag and extract the characters in separate steps
function getUrl(url, flagString) {
    if (!url || !flagString) {
        return new Error('getUrl needs a url and a flag string');
    }

    var urlT = url,
        flagStringT = flagString,
        reg = new RegExp(flagStringT),
        reg1 = /^.*#(.*)\?{1}.*$/;

    // No flag in the URL: nothing to extract
    if (!reg.test(urlT)) {
        return;
    }

    return reg1.exec(urlT)[1];
}

getUrl(foo, 'resizeFlag') // "durationmgr/dailyTask/workPanel"
getUrl(bar, 'resizeFlag') // undefined

// Solution 2: put the check and the extraction in a single expression
var reg2 = /^.*#(.*)\?{1}(?=.*?resizeFlag).*$/;

reg2.exec(foo)[1]; // "durationmgr/dailyTask/workPanel"
reg2.exec(bar) // null

// The code for solution 2 is much cleaner, which shows the power of lookahead
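
Solution 2 hard-codes resizeFlag into the pattern. As a sketch of how it might be generalized, the expression can be built from a parameter. Both escapeRegExp and extractHashPath are hypothetical names introduced here, not part of the original code:

```javascript
// Hypothetical helper: escape characters that are special inside a regex.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the text between # and ? when `flag` appears after the ?,
// otherwise null. Same structure as reg2, with the flag injected.
function extractHashPath(url, flag) {
    var re = new RegExp('^.*#(.*)\\?(?=.*?' + escapeRegExp(flag) + ').*$');
    var m = re.exec(url);
    return m ? m[1] : null;
}

var withFlag = "http://localhost:8002/Home/Login#durationmgr/dailyTask/workPanel?resizeFlag=false";
var withoutFlag = "http://localhost:8002/Home/Login#durationmgr/dailyTask/workPanel?flight=1";

var found = extractHashPath(withFlag, 'resizeFlag');      // "durationmgr/dailyTask/workPanel"
var missing = extractHashPath(withoutFlag, 'resizeFlag'); // null
```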

6. Verify password strength

Verifying password strength is very useful for form validation, and this check can be handled entirely on the front end. Here is a common example:

// Requirements: The password must contain at least one uppercase letter, one lowercase letter, one digit, and one special character, and must be 8 to 16 characters long

var reg = /^(?=.*\d)(?=.*[A-Z])(?=.*[a-z])(?=.*[!@#$%^&*?\(\)]).{8,16}$/;
var reg1 = /^.*(?=^.{8,16}$)(?=.*\d)(?=.*[A-Z])(?=.*[a-z])(?=.*[!@#$%^&*?\(\)]).*$/;
var reg2 = /(?=^.{8,16}$)(?=.*\d)(?=.*[A-Z])(?=.*[a-z])(?=.*[!@#$%^&*?\(\)])/;

// Test reg
reg.test('123abcDEF$'); // true
reg.test('123abcccc$'); // false: no uppercase letter
reg.test('123DDDDDD$'); // false: no lowercase letter
reg.test('ddddDDDfff$'); // false: no digit
reg.test('ddddDDDfffdd12'); // false: no special character
reg.test('1dDDff$'); // false: fewer than 8 characters
reg.test('1ddDDff$1234567891234'); // false: more than 16 characters

// Test reg1
reg1.test('123abcDEF$'); // true
reg1.test('123abcccc$'); // false: no uppercase letter
reg1.test('123DDDDDD$'); // false: no lowercase letter
reg1.test('ddddDDDfff$'); // false: no digit
reg1.test('ddddDDDfffdd12'); // false: no special character
reg1.test('1dDDff$'); // false: fewer than 8 characters
reg1.test('1ddDDff$1234567891234'); // false: more than 16 characters

// reg2 gives the same results as above

The key to all three expressions is the use of positive lookahead. The difference is that reg1 also uses a lookahead for the string length, while reg checks the length with an ordinary pattern

The three patterns above apply to strings that must satisfy several conditions at the same time, similar to a logical AND (&&) operation

Two questions worth considering: (1) What is the difference between reg and reg1? (2) What is the difference between reg1 and reg2?
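
The AND-like behaviour can also be written out as separate checks, which makes it possible to report which rule failed. The following is only a sketch of that idea; the rules table and failedRules are hypothetical names, and the character set is the same one assumed in the expressions above:

```javascript
// One small regex per requirement; a password must pass all of them.
var rules = [
    { name: 'length 8-16', re: /^.{8,16}$/ },
    { name: 'digit', re: /\d/ },
    { name: 'uppercase letter', re: /[A-Z]/ },
    { name: 'lowercase letter', re: /[a-z]/ },
    { name: 'special character', re: /[!@#$%^&*?()]/ }
];

// Return the names of the rules the password does not satisfy.
function failedRules(password) {
    return rules
        .filter(function (rule) { return !rule.re.test(password); })
        .map(function (rule) { return rule.name; });
}

var ok = failedRules('123abcDEF$');      // []
var noUpper = failedRules('123abcccc$'); // ['uppercase letter']
var tooShort = failedRules('1dDDff$');   // ['length 8-16']
```

A single lookahead expression is more compact; the table form trades that for clearer error messages.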

7. Match date format such as “2019-01-02”, “2019/02/03”, “2019.03.04”

This date format is fairly fixed: four digits, a separator, one or two digits, the same separator, one or two digits. The expression is relatively simple to write, so we build it up step by step

// Start with an example
var reg = /^\d{4}(\-|\/|\.)\d{1,2}\1\d{1,2}$/;

reg.test('2019-01-02'); // true
reg.test('2019-1-02'); // true
reg.test('2019-01-2'); // true
reg.test('201-01-02'); // false: the year has fewer than four digits
reg.test('201a-01-02'); // false: the fourth character is not a digit

// Now look at the bugs in this expression
reg.test('2019-00-01'); // true: the month can be 00
reg.test('2019-01-00'); // true: the day can be 00
reg.test('0000-01-02'); // true: the year can be 0000

// Improved version, though only slightly: it still does not limit the range of the month and day
var reg1 = /^(?!^0000.*$)(?!^.{5}00.*$)(?!^.*00$)\d{4}(\-|\/|\.)\d{1,2}\1\d{1,2}$/;
reg1.test('2019-01-02'); // true
reg1.test('2019-1-02'); // true
reg1.test('2019-01-2'); // true
reg1.test('2019-00-01'); // false
reg1.test('2019-01-00'); // false
reg1.test('0000-01-02'); // false

// Final version: handles all cases, but hard-codes the separator as - and requires two-digit months and days (01, 02, 03, 04, ...)
var reg = /^(?:(?!0000)[0-9]{4}-(?:(?:0[1-9]|1[0-2])-(?:0[1-9]|1[0-9]|2[0-8])|(?:0[13-9]|1[0-2])-(?:29|30)|(?:0[13578]|1[02])-31)|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)-02-29)$/;

reg.test('2019-01-02') // true
reg.test('2019-02-29') // false: February 2019 has no 29th
reg.test('2019-02-28') // true
reg.test('2019-13-02') // false: there is no 13th month
reg.test('2019-08-32'); // false: there is no 32nd day
reg.test('2019-08-31'); // true
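
The final expression encodes the whole calendar, which makes it hard to read. A possible alternative, sketched here under the assumption that only the yyyy-MM-dd form is accepted, is to let a short regex check the shape and a Date object check the calendar. isValidDate is a hypothetical name:

```javascript
// Validate a yyyy-MM-dd string by round-tripping it through a Date object.
function isValidDate(text) {
    var m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(text);
    if (!m) {
        return false;
    }
    var year = Number(m[1]), month = Number(m[2]), day = Number(m[3]);
    if (year === 0) {
        return false; // mirrors the (?!0000) rule of the regex
    }
    var date = new Date(0);
    // setUTCFullYear avoids the 1900 offset that Date.UTC applies to years 0-99.
    date.setUTCFullYear(year, month - 1, day);
    // Out-of-range months or days roll over, so the parts no longer match.
    return date.getUTCFullYear() === year &&
        date.getUTCMonth() === month - 1 &&
        date.getUTCDate() === day;
}

isValidDate('2019-02-29'); // false
isValidDate('2020-02-29'); // true
isValidDate('2019-13-02'); // false
isValidDate('2019-08-31'); // true
```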

8. Implement template string replacement function

Template string replacement is very common: the EJS template engine, Vue templates, and ES6 template literals all rely on it. Here we implement a basic template string replacement function

var reg = 'I am a {job}, Do you like my {book} and {job}',
    data = {
        job: 'teacher',
        book: 'Javascript Advanced Programming'
    };

function renderTpl(reg, data) {
    if (!reg) {
        return new Error('reg is needed');
    }

    // Collect every {placeholder} in the template
    var arr = reg.match(/\{.+?\}/g);

    // match returns null when there are no placeholders: return the template unchanged
    if (!arr || arr.length <= 0) {
        return reg;
    }

    // Strip the braces to get the bare keys
    arr = arr.map(function(item) {
        return item.replace(/\{|\}/g, '');
    });

    for (var i = 0, length = arr.length; i < length; i++) {
        var item = arr[i];
        // Build the pattern with RegExp rather than eval
        var regExpr = new RegExp('\\{' + item + '\\}', 'g');
        reg = reg.replace(regExpr, data[item]);
    }

    return reg;
}

renderTpl(reg, data); // "I am a teacher, Do you like my Javascript Advanced Programming and teacher"
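
The same result can be obtained in a single pass by giving replace a callback instead of looping over the keys. This is a sketch rather than a drop-in replacement: renderTemplate is a hypothetical name, placeholders are assumed to contain only word characters, and unknown keys are left untouched:

```javascript
// Replace every {key} with data[key]; leave {key} as-is when data has no such key.
function renderTemplate(template, data) {
    return template.replace(/\{(\w+)\}/g, function (match, key) {
        return Object.prototype.hasOwnProperty.call(data, key) ? data[key] : match;
    });
}

var rendered = renderTemplate(
    'I am a {job}, Do you like my {book} and {job}',
    { job: 'teacher', book: 'Javascript Advanced Programming' }
);
// "I am a teacher, Do you like my Javascript Advanced Programming and teacher"
```

The callback receives the full match and then each capture group, so the key is available without stripping the braces by hand.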

9. Add thousands separators

Adding thousands separators is a common requirement in financial trading software and makes large numbers much easier to read

    function addSeparator(str, sep) {
        if (!str) {
            return '';
        }
        var strT = str + '',
            sepT = sep || ',',
            arr = strT.split('.'),
            re = /(\d+)(\d{3})/;

        var integer = arr[0],
            decimal = arr.length <= 1 ? "" : '.' + arr[1];

        // Each pass splits off the last three digits of the leftmost unseparated run
        while (re.test(integer)) {
            integer = integer.replace(re, "$1" + sepT + "$2");
        }
        return integer + decimal;
    }

    console.log(addSeparator(-1987654321.23)); // -1,987,654,321.23
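
Since lookahead has come up several times already, here is a sketch of the same formatting done with a single lookahead expression instead of a loop. addSeparatorLookahead is a hypothetical name, and the function assumes a plain decimal number or numeric string as input:

```javascript
// Insert a separator at every position in the integer part that is
// followed by a multiple of three digits and then no further digit.
function addSeparatorLookahead(value, sep) {
    var parts = String(value).split('.');
    parts[0] = parts[0].replace(/\B(?=(\d{3})+(?!\d))/g, sep || ',');
    return parts.join('.');
}

var formatted = addSeparatorLookahead(-1987654321.23); // "-1,987,654,321.23"
var small = addSeparatorLookahead(123);                // "123"
var fraction = addSeparatorLookahead(1234.5678);       // "1,234.5678"
```

Only the integer part is rewritten, so digits after the decimal point are never grouped.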

This requirement also introduces a problem: what is displayed is no longer a plain number

If the page needs to compute with the number again, it will not get the expected value from the displayed text, so you can store the original value in an attribute of the HTML tag before adding the thousands separators

For example:

<span data-amount='-1987654321.23'>-1,987,654,321.23</span>

10. Call the exec method in a loop

Calling exec in a loop achieves the same “match all” functionality as the match method, while returning more complete information

var x = "a.xxx.com b.xxx.com c.xxx.com";
var reg = /(.*?)\.(?:.*?)\.com\s?/g; // Note 1: the global modifier g is required

var arr = [], arr2 = [], item;

while (item = reg.exec(x)) { // Note 2: the result of exec serves as the loop condition; it is null once there are no more matches
    arr2.push(item);
    arr.push(item[1]);
}

console.log(arr);
// ["a", "b", "c"]

console.log(arr2);
// ["a.xxx.com ", "a", index: 0, input: "a.xxx.com b.xxx.com c.xxx.com", groups: undefined]
// ["b.xxx.com ", "b", index: 10, input: "a.xxx.com b.xxx.com c.xxx.com", groups: undefined]
// ["c.xxx.com", "c", index: 20, input: "a.xxx.com b.xxx.com c.xxx.com", groups: undefined]

console.log(x.match(reg));
// ["a.xxx.com ", "b.xxx.com ", "c.xxx.com"]

var reg1 = /([a-zA-Z])(?=\.(?:[a-zA-Z]+)\.com\s?)/g;

console.log(x.match(reg1));
// ["a", "b", "c"]
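
In engines that support ES2020, String.prototype.matchAll returns the same per-match details without a hand-written loop. A short sketch, assuming such an engine (the variable names are only for illustration):

```javascript
var x = "a.xxx.com b.xxx.com c.xxx.com";
var reg = /(.*?)\.(?:.*?)\.com\s?/g; // matchAll also requires the g flag

// Each entry has the same shape as an exec result: groups, index, input.
var matches = Array.from(x.matchAll(reg));

var hosts = matches.map(function (m) { return m[1]; });        // ["a", "b", "c"]
var positions = matches.map(function (m) { return m.index; }); // [0, 10, 20]
```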

11. Determine whether a value is a decimal (it may contain only digits and a decimal point)

function isDecimal(strValue) {
    var objRegExp = /^\d+\.\d+$/;
    return objRegExp.test(strValue);
}
isDecimal(3.14); // true
isDecimal("q3.14"); // false
isDecimal("3.14 q"); // false
isDecimal(3.); // false: the number 3. is converted to the string "3"
isDecimal(.2); // true: in JS, .2 === 0.2, which is converted to the string "0.2"

12. Check whether a string consists of 2 to 4 Chinese characters

function isChina(strValue) {
    var objRegExp = /^\s*[\u4E00-\u9FA5]{2,4}\s*$/; // Allow spaces on both sides
    return objRegExp.test(strValue);
}
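
A few sample calls make the behaviour concrete. The function is repeated so that the snippet runs on its own, and the sample strings are made up for this illustration:

```javascript
function isChina(strValue) {
    var objRegExp = /^\s*[\u4E00-\u9FA5]{2,4}\s*$/; // Allow spaces on both sides
    return objRegExp.test(strValue);
}

isChina('张三');       // true: two Chinese characters
isChina(' 王小明 ');   // true: surrounding spaces are allowed
isChina('张');         // false: only one character
isChina('张三a');      // false: contains a Latin letter
isChina('一二三四五'); // false: five characters
```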

13. Transpose words in a string

var name = "Doe, John";
name.replace(/(\w+)\s*,\s*(\w+)/, "$2 $1");

// The point here is that $1 and $2 in the replacement string refer to the captured groups, so the groups can be reordered by hand
//=> "John Doe"
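
When there are more than a couple of groups, numbered references become hard to follow. In engines that support ES2018 named capture groups, the same swap can be sketched with names instead of numbers (last and first are names chosen for this illustration):

```javascript
var fullName = "Doe, John";

// $<first> and $<last> refer to the named groups in the pattern.
var swapped = fullName.replace(/(?<last>\w+)\s*,\s*(?<first>\w+)/, "$<first> $<last>");
//=> "John Doe"
```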