Today is June 1, Children’s Day, so first of all: happy holiday to all the children! (Grown-ups who only pretend to be kids don’t count, ha ha. Just kidding: stay young at heart and you stay young forever, so happy holiday to you too.) I would wish the bigwigs a happy Children’s Day as well if I weren’t afraid of getting beaten up, so instead: tea for the bigwigs, who are forever young. Back to the point. The last article covered regular expression moves one and two, so today we continue with moves three and four. Calling them “moves” feels slightly off, since the masters say real fights are won by inner strength, but never mind: I’m only a rookie after all.

  • Move 3: Regular expression parentheses function

    1. Grouping and branching structures

      • Grouping: to match consecutive “a” characters we can use /a+/, but to match consecutive occurrences of “ab” we need parentheses: /(ab)+/. The parentheses make the quantifier + apply to ab as a whole.

      • Branching structure: we have already seen branching in the form (p1|p2). The role of the parentheses here is self-evident: they delimit all the possible alternatives of the subexpression.

        // No g flag here: with g, test() resumes from lastIndex and the second call would return false
        var regex = /^I love (javascript|Regular Expression)$/;
        console.log(regex.test("I love javascript"));
        console.log(regex.test("I love Regular Expression"));
        //true
        //true
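A minimal sketch of the grouping point, with sample strings chosen here purely for illustration: without parentheses the quantifier + applies only to the preceding character, while with parentheses it applies to the whole group. The last lines show why it is safer to leave the g flag off a regex that is reused with test(): with g, the regex remembers lastIndex between calls.

```javascript
// Without a group, + applies only to "b"; with a group, it applies to "ab".
var noGroup = /^ab+$/;
var withGroup = /^(ab)+$/;

console.log(noGroup.test("abbb"));     // true
console.log(noGroup.test("ababab"));   // false
console.log(withGroup.test("ababab")); // true
console.log(withGroup.test("abbb"));   // false

// A regex with the g flag is stateful: test() resumes from lastIndex.
var stateful = /^I love (javascript|Regular Expression)$/g;
var first = stateful.test("I love javascript");          // true, lastIndex is now 17
var second = stateful.test("I love Regular Expression"); // false, the search started at index 17
console.log(first, second);
```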
    2. Capturing groups

      • The other thing parentheses are used for is extracting data, which also makes replacement easier

        var regex = /\d{4}-\d{2}-\d{2}/;
        // Can be changed to parenthesis version:
        var regex = /(\d{4})-(\d{2})-(\d{2})/;

        Why add the parentheses? What is the benefit? When the result of match is printed (as in the next example), the first element is the whole match, followed by the content matched by each group (each pair of parentheses), then the index of the match, and finally the input text. That makes extracting data easy. You can also read the groups from the constructor’s legacy global properties RegExp.$1 to RegExp.$9, for example console.log(RegExp.$1); // "2020"

        var regex = /(\d{4})-(\d{2})-(\d{2})/;
        var string = "2020-06-01";
        console.log(string.match(regex));
        //["2020-06-01", "2020", "06", "01", index: 0, input: "2020-06-01", groups: undefined]
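As a small sketch of the extraction idea, the groups can be read straight from the array that match returns. The helper name parseDate and the added ^ and $ anchors are assumptions made for this illustration, not part of the original example.

```javascript
// parseDate is a hypothetical helper used only for this illustration.
function parseDate(str) {
  var m = str.match(/^(\d{4})-(\d{2})-(\d{2})$/);
  if (m === null) {
    return null; // match returns null when nothing matches
  }
  // m[0] is the whole match; m[1], m[2], m[3] are the three groups
  return { year: m[1], month: m[2], day: m[3] };
}

console.log(parseDate("2020-06-01")); // { year: '2020', month: '06', day: '01' }
console.log(parseDate("June 1"));     // null
```

Reading m[1] to m[3] keeps the data local to the call, whereas RegExp.$1 to RegExp.$9 are shared global state that the next match overwrites.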
      • Capturing groups are just as handy for replacement

        var regex = /(\d{4})-(\d{2})-(\d{2})/;
        var string = "2020-06-01";
        var result = string.replace(regex, function () {
          return RegExp.$2 + "/" + RegExp.$3 + "/" + RegExp.$1;
        });
        console.log(result);
        // 06/01/2020
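The same replacement can be sketched without the legacy RegExp.$n properties. The two variants below use only standard behavior of String.prototype.replace; the variable names are chosen here for illustration.

```javascript
var regex = /(\d{4})-(\d{2})-(\d{2})/;
var string = "2020-06-01";

// In a replacement string, $1, $2 and $3 stand for the captured groups.
var viaString = string.replace(regex, "$2/$3/$1");

// A replacement function receives the whole match first, then each group.
var viaFunction = string.replace(regex, function (match, year, month, day) {
  return month + "/" + day + "/" + year;
});

console.log(viaString);   // 06/01/2020
console.log(viaFunction); // 06/01/2020
```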
    3. Backreferences

      • You may be wondering what a backreference is. Taking what the regex captured and using it in a replacement is a reference; taking what the regex captured and matching it again inside the same regex is a backreference, that is, referring back to a group captured earlier. That is not easy to grasp in the abstract, so let’s serve the dish right away with an example.

        // Formats to match: 2020-06-01 2020/06/01 2020.06.01
        var regex = /\d{4}(-|\/|\.)\d{2}(-|\/|\.)\d{2}/;
        var string1 = "2020-06-01";
        var string2 = "2020/06/01";
        var string3 = "2020.06.01";
        console.log(regex.test(string1));
        console.log(regex.test(string2));
        console.log(regex.test(string3));
        //true
        //true
        //true
        //========== Stop. A serious dividing line: have you considered that "2020-06/01" matches too?
        var string4 = "2020-06/01";
        console.log(regex.test(string4));
        //true: the mixed separators are matched as well
      • So what now? You are probably thinking: if only the second separator had to be whatever the first one matched. It can! JS is powerful enough to have thought of that: here comes the backreference

        var regex = /\d{4}(-|\/|\.)\d{2}\1\d{2}/;
        var string1 = "2020-06-01";
        var string2 = "2020/06/01";
        var string3 = "2020.06.01";
        var string4 = "2020-06/01";
        console.log(regex.test(string1));
        console.log(regex.test(string2));
        console.log(regex.test(string3));
        console.log(regex.test(string4));
        //true
        //true
        //true
        //false: the mixed separators are now rejected
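The same mechanism can be sketched on a different problem: spotting a word that is accidentally typed twice. The helper name hasRepeatedWord and the sample sentences are assumptions made for this illustration.

```javascript
// hasRepeatedWord is a hypothetical helper used only for this illustration.
function hasRepeatedWord(text) {
  // (\w+) captures a word; \1 must then match exactly the same text again.
  return /\b(\w+)\s+\1\b/.test(text);
}

console.log(hasRepeatedWord("this is is a typo")); // true
console.log(hasRepeatedWord("this is a test"));    // false
```

As with the date separators, \1 does not repeat the pattern \w+; it repeats the text that the group actually captured.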
      • What about nested parentheses? Remember that each pair of parentheses is a group, and groups are numbered by the order of their opening parenthesis: $1 is the outermost group, $2 belongs to the next opening parenthesis, $3 to the one after that, and so on.

        var regex = /^((\d)(\d(\d)))\1\2\3\4$/;
        var string = "1231231233";
        console.log(regex.test(string)); // true (the match also fills RegExp.$1 to $4)
        console.log(RegExp.$1); // 123 (the outermost group)
        console.log(RegExp.$2); // 1 (the first (\d))
        console.log(RegExp.$3); // 23
        console.log(RegExp.$4); // 3
      • What does \10 mean? The tenth group, or \1 followed by 0? The former, as long as the pattern actually has ten groups.
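Both numbering rules can be checked with a short sketch. The ten-group pattern and its test strings are made up for this illustration.

```javascript
// Groups are numbered by the position of their opening parenthesis.
var nested = "1231231233".match(/^((\d)(\d(\d)))\1\2\3\4$/);
var groups = nested.slice(1);
console.log(groups); // [ '123', '1', '23', '3' ]

// With ten groups in the pattern, \10 refers to group 10 (here "#").
var tenGroups = /(1)(2)(3)(4)(5)(6)(7)(8)(9)(#) \10+/;
var refersToTenth = tenGroups.test("123456789# ######");
var refersToOneThenZero = tenGroups.test("123456789# 10");
console.log(refersToTenth);       // true
console.log(refersToOneThenZero); // false: \10 is not "\1 followed by 0"
```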

    4. Non-capturing groups

      All the groups so far capture the data they match so that it can be referenced later, which is why they are called capturing groups.

      If you only want the basic grouping behavior of parentheses, without referencing the group or storing what it matched, you can use a non-capturing group (?:p). Let’s look at another example.

      var regex = /(?:ab)+/g;
      var string = "ababa abbb ababab";
      console.log(string.match(regex));
      // ["abab", "ab", "ababab"]
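The difference between the two kinds of group shows up in the match array. A minimal sketch, using the sample string "abab" chosen for this illustration:

```javascript
var capturing = "abab".match(/(ab)+/);
var nonCapturing = "abab".match(/(?:ab)+/);

console.log(capturing.length);    // 2: the whole match plus group 1
console.log(capturing[1]);        // ab (the last repetition of the group)
console.log(nonCapturing.length); // 1: the whole match only
console.log(nonCapturing[0]);     // abab
```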
    5. A few examples

      • trim: emulate the string trim method by removing whitespace at the beginning and end of the string

        function trim(str){
          return str.replace(/^\s+|\s+$/g, '');
        }
        console.log(trim(" foobar "));
        // foobar

        A second way is to match the whole string and keep only the captured middle part:

        function trim(str){
          return str.replace(/^\s*(.*?)\s*$/g, "$1");
          // . matches any character (except line terminators), * means zero or more times, ? makes the quantifier non-greedy
          // \s stands for a whitespace character
          // .*? takes as few characters as possible, stopping as soon as the rest of the pattern (\s*$) can match
        }
        console.log(trim(" foobar "));
        // foobar
      • Capitalize the first letter of each word

        function titleize(str){
          return str.toLowerCase().replace(/(?:^|\s)\w/g, function(c){
            return c.toUpperCase();
          });
        }
      • Camel case

        function camelize(str){
          return str.replace(/[-_\s]+(.)?/g, function(match, c){
            return c ? c.toUpperCase() : "";
          });
        }
        console.log(camelize("-moz-transform"));
        // MozTransform
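Going the other way, here is a hedged sketch of a hypothetical dasherize helper that undoes camelize with the same tools: a capturing group and $1 in the replacement string. The helper is an illustration only and is not taken from the examples above.

```javascript
// dasherize is a hypothetical helper used only for this illustration.
function dasherize(str) {
  return str
    .replace(/([A-Z])/g, "-$1") // put a dash before every capital letter
    .replace(/[-_\s]+/g, "-")   // collapse runs of -, _ and whitespace into one dash
    .toLowerCase();
}

console.log(dasherize("MozTransform")); // -moz-transform
```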
  • Move 4: How regular expression backtracking works

    The word “backtracking” sounds rather lofty, and programmers do like to show off, ha ha. The word comes up a lot, but do you really understand it?

    • Matching without backtracking

      Suppose our regex is /ab{1,3}c/, where the subexpression b{1,3} means that the character “b” occurs 1 to 3 times in a row.

      When the target string is “abbbc”, there is no “backtracking”: b{1,3} takes all three “b” characters, and the next character, “c”, matches right away.

    • Matching with backtracking

      If the target string is “abbc”, backtracking occurs.

      In step 5 of the figure (the red part) the attempt fails: b{1,3} has already matched two “b” characters and tries to match a third, but the third character turns out to be “c”, not “b”. So b{1,3} is considered done, and the state goes back to the one in which it had matched two “b” characters, that is, the state of step 4. From there the subexpression c matches the character “c”. Step 6 in the figure is what we call backtracking.

      Let’s take one more example, even though coming up with examples every day is hard work.

      Take a regex such as /ab{1,3}bbc/ and the target string “abbbc”.

      Steps 7 and 10 of the figure are backtracking. Step 7 returns to the same state as step 4, in which b{1,3} has matched two “b” characters; step 10 returns to the same state as step 3, in which b{1,3} has matched only one “b”, and that is the final result for b{1,3}.
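Backtracking itself is invisible from JavaScript, but its result can be observed. In the sketch below b{1,3} is wrapped in a capturing group so that we can see how many “b” characters it ends up keeping. The helper bCount and the pattern /a(b{1,3})bbc/ are assumptions made for this illustration.

```javascript
// bCount is a hypothetical helper: it reports how many "b" characters group 1 kept.
function bCount(regex, str) {
  var m = str.match(regex);
  return m === null ? null : m[1].length;
}

console.log(bCount(/a(b{1,3})c/, "abbbc"));   // 3: nothing has to be given back
console.log(bCount(/a(b{1,3})c/, "abbc"));    // 2
console.log(bCount(/a(b{1,3})bbc/, "abbbc")); // 1: two "b" characters were given back
```

In the last case the quantifier first takes all three “b” characters, then has to return two of them so that the literal bb that follows can match.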

    • Common forms of backtracking

      Put simply, backtracking works like this: because there are many possibilities, we try them one by one until either, at some step, the overall match succeeds, or everything has been tried and the overall match fails. In short, it all comes down to one word: try! So let’s look at where backtracking occurs in a regular expression.

      • Greedy quantifiers

        {1,3} is the model greedy quantifier. Being greedy, it keeps eating as much as it can, trying the possibilities from most to fewest, and only when it really cannot keep what it ate does it spit some back out, ha ha. That is a greedy quantifier. One special case remains: what happens when several greedy quantifiers sit next to each other? That follows the law of the jungle: first come, first served, and whoever comes later takes what is left. Chance always favors the prepared.

        var string = "12345";
        var regex = /(\d{1,3})(\d{1,3})/;
        console.log(string.match(regex));
        // ["12345", "123", "45", index: 0, input: "12345", groups: undefined]

        The first \d{1,3} matches “123” and the second \d{1,3} matches “45”.

      • Lazy quantifiers

        As we said earlier, a lazy quantifier is a greedy quantifier followed by a question mark (?), which means match as few characters as possible.

        var string = "12345";
        var regex = /(\d{1,3}?)(\d{1,3})/;
        console.log(string.match(regex));
        // ["1234", "1", "234", index: 0, input: "12345", groups: undefined]

        Here \d{1,3}? matches only one character, “1”, and the following \d{1,3} matches “234”.

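        var string = "12345";
        var regex = /^(\d{1,3}?)(\d{1,3})$/;
        console.log(string.match(regex));
        // ["12345", "12", "345", index: 0, input: "12345", groups: undefined]

        Although lazy quantifiers are not greedy, they still backtrack.

        With the target string “12345”, \d{1,3}? first takes only “1”, but then the second \d{1,3} can reach at most “234” and $ fails before the final “5”, so the engine backtracks and lets the lazy quantifier take one more digit.

        In the end \d{1,3}? matches “12”, which is two digits, not one.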
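The order in which the two quantifiers try their options can be modelled by hand. The function below is a simplified, hypothetical model and not how a real engine is implemented: it assumes the input consists only of digits and that the match starts at index 0. The first group tries its lengths in quantifier order (greedy 3, 2, 1 or lazy 1, 2, 3), and the second group is always greedy.

```javascript
// splitDigits is a hypothetical model of two adjacent \d{1,3} quantifiers.
function splitDigits(str, firstIsLazy, anchored) {
  var firstOrder = firstIsLazy ? [1, 2, 3] : [3, 2, 1];
  for (var i = 0; i < firstOrder.length; i++) {
    var n = firstOrder[i];
    for (var m = 3; m >= 1; m--) {
      if (n + m > str.length) continue;               // not enough digits left
      if (anchored && n + m !== str.length) continue; // $ would fail here
      return [str.slice(0, n), str.slice(n, n + m)];
    }
    // every length for the second group failed: backtrack into the first group
  }
  return null;
}

console.log(splitDigits("12345", false, false)); // [ '123', '45' ]
console.log(splitDigits("12345", true, false));  // [ '1', '234' ]
console.log(splitDigits("12345", true, true));   // [ '12', '345' ]
```

The three calls correspond to /(\d{1,3})(\d{1,3})/, /(\d{1,3}?)(\d{1,3})/ and /^(\d{1,3}?)(\d{1,3})$/ respectively.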

      • Branching structure

        We know that branches are lazy too. For example, when /can|candy/ is matched against the string “candy”, the result is “can”, because the branches are tried one by one, and once an earlier branch succeeds the later ones are not tried.

        In a branching structure, an earlier subpattern may produce a local match; if the rest of the expression then fails to match overall, the engine goes on to try the remaining branches. This kind of attempt can also be seen as backtracking.

        Take a regex such as /^(?:can|candy)$/ and the target string “candy”: the branch can matches first, then $ fails because “dy” is still left, so the engine returns to the branching structure and tries candy.

        Step 5 of the figure does not return to an earlier state, but it does return to the branching structure to try the next possibility, so it can be regarded as backtracking as well.
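A short sketch of the branching behavior. The anchored pattern and the reordered pattern are assumptions added for this illustration.

```javascript
// Unanchored: the first branch that succeeds wins, so "candy" is never tried.
var unanchored = "candy".match(/can|candy/)[0];
console.log(unanchored); // can

// Anchored (an assumed example): after "can", $ fails, so the engine
// goes back to the alternation and tries the next branch.
var anchored = "candy".match(/^(?:can|candy)$/)[0];
console.log(anchored); // candy

// Putting the longer branch first avoids relying on that retry.
var reordered = "candy".match(/candy|can/)[0];
console.log(reordered); // candy
```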

    • Summary

      Put briefly: because there are many possibilities, try them one at a time, until either the whole match succeeds at some step, or all of them have been tried and the whole match fails.

      1. The “try” strategy of greedy quantifiers is haggling when buying clothes: the price is too high, make it cheaper; still no good, make it cheaper again.
      2. The “try” strategy of lazy quantifiers is raising the price when selling: that’s too little, can you give a bit more? Still too little, a bit more.
      3. The “try” strategy of branching structures is shopping around: this shop won’t do, try another; still no good, try the next.
  • Move 5: To be continued…