On the front end, regular expressions are useful for much more than phone-number and email validation. I have collected and organized some practical uses of regular expressions, and want to share my understanding and experience of them through examples.

Split window.location.search and convert it to a JSON object

For example, given the URL xxxxx.com?name=A&abc=&id=123&ticket=fxxxx&time=2021-03-04&serveicename=sys, you can parse the query string into a JSON object with the following function. This example focuses on the concept of grouping.

// Simulates window.location.search
let query = "?name=A&abc=&id=123&ticket=fxxxx&time=2021-03-04&serveicename=sys";
function resolveToObject(query) {
    const reg = /([^=&?]+)=([^=&?]*)/g;
    let group = reg.exec(query);
    const object = {};
    while (group) {
        object[group[1]] = group[2];
        group = reg.exec(query);
    }
    return object;
}
console.log(resolveToObject(query));
/** For the query string "?name=A&abc=&id=123&ticket=fxxxx&time=2021-03-04&serveicename=sys" the output is {name: 'A', abc: '', id: '123', ticket: 'fxxxx', time: '2021-03-04', serveicename: 'sys'} **/

Analysis:

  1. The regular expression /([^=&?]+)=([^=&?]*)/g reads as: in global mode, match (one or more characters other than =, & and ?)=(zero or more characters other than =, & and ?).

  2. The exec method matches the regular expression against the target string. If a match is found, it returns an array (group); otherwise it returns null. group[0] is the text matched by the whole expression, and group[n] is the text matched by the nth pair of parentheses. Because the expression is in global mode, after each exec call it remembers (in lastIndex) where the match ended in the target string, and the next exec call starts searching from that position. The following trace may help in understanding global mode and grouping:

1st reg.exec(query): searches "?name=A&abc=&id=123&ticket=fxxxx&time=2021-03-04&serveicename=sys"; group is ['name=A', 'name', 'A']
2nd reg.exec(query): searches the remainder "&abc=&id=123&ticket=fxxxx&time=2021-03-04&serveicename=sys"; group is ['abc=', 'abc', '']
3rd reg.exec(query): searches the remainder "&id=123&ticket=fxxxx&time=2021-03-04&serveicename=sys"; group is ['id=123', 'id', '123']
...and so on, until exec returns null.
Note: without the g flag, that is /([^=&?]+)=([^=&?]*)/, this becomes an infinite loop, because every exec call starts from the beginning and matches the same content.
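The way a global regular expression advances through the string can be observed through its lastIndex property. The sketch below is only an illustration on a shortened query string; the helper name traceMatches is made up for this example and is not part of the original code.

```javascript
// Hypothetical helper: records each match and the lastIndex value that exec leaves behind.
function traceMatches(query) {
    const reg = /([^=&?]+)=([^=&?]*)/g;
    const steps = [];
    let group = reg.exec(query);
    while (group) {
        steps.push({
            match: group[0],
            key: group[1],
            value: group[2],
            lastIndex: reg.lastIndex, // the next exec call starts searching here
        });
        group = reg.exec(query);
    }
    return steps;
}

const steps = traceMatches("?name=A&abc=&id=123");
console.log(steps.map((step) => `${step.match} -> lastIndex ${step.lastIndex}`));
// [ 'name=A -> lastIndex 7', 'abc= -> lastIndex 12', 'id=123 -> lastIndex 19' ]
```

Each lastIndex is the position where the next search begins, which is why the loop only terminates when the g flag is present.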

Java equivalent:

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String query = "?name=A&abc=&id=123&ticket=fxxxx&time=2021-03-04&serveicename=sys";
        Pattern reg = Pattern.compile("([^=&?]+)=([^=&?]*)");
        Matcher m = reg.matcher(query);
        Map<String,String> ret = new HashMap<String,String>();
        while (m.find()) {
            System.out.println(m.group());
            // m.group() is equivalent to js group[0]
            ret.put(m.group(1), m.group(2));
        }
        System.out.println(ret);
    }
}

Query a parameter value from the query string

You can use the following function:

// Simulates window.location.search
let query = "?name=A&abc=&id=123&ticket=fxxxx&time=2021-03-04&serveicename=sys";
function getValueByParamName(name, query) {
  const reg = new RegExp(`(?:^|[?&])${name}=([^=&?]*)`);
  const group = reg.exec(query);
  if (group) {
    return group[1];
  }
  return null;
}
console.log(getValueByParamName("name", query));
/** the output is A **/

Analysis:

  1. If name is "name", the reg object in the code is equivalent to /(?:^|[?&])name=([^=&?]*)/.
  2. Here, (?:) is a non-capturing group: it groups without capturing, so group[1] in the array returned by exec is the content matched by the second pair of parentheses. If a plain () were used instead of (?:), the function would have to return group[2].
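One caveat when building the expression from a parameter name: if the name contains characters that are special in a regular expression (such as . or [), they should be escaped first. The sketch below shows one possible way to do that; escapeRegExp and getValueSafely are hypothetical names introduced here and are not part of the original code.

```javascript
// Hypothetical helper: escapes regular-expression metacharacters in a literal string.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Same idea as getValueByParamName, but also safe for names such as "a.b".
function getValueSafely(name, query) {
    const reg = new RegExp(`(?:^|[?&])${escapeRegExp(name)}=([^=&?]*)`);
    const group = reg.exec(query);
    // (?:...) does not capture, so the value is still group[1].
    return group ? group[1] : null;
}

console.log(getValueSafely("a.b", "?axb=1&a.b=2"));  // 2 (an unescaped "." would match "axb" and give 1)
console.log(getValueSafely("id", "?name=A&id=123")); // 123
console.log(getValueSafely("missing", "?name=A"));   // null
```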

Java equivalent

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String query = "?name=A&abc=&id=123&ticket=fxxxx&time=2021-03-04&serveicename=sys";
        System.out.println(getValueByParamName("name", query));
    }

    private static String getValueByParamName(String name,String query) {
       Pattern reg = Pattern.compile("(?:^|[?&])" + name + "=([^=&?]*)");
       Matcher m = reg.matcher(query);
       if (m.find()) {
           return m.group(1);
       }
       return null;
    }
}

Validate that a password has at least six characters and contains digits, uppercase letters, and lowercase letters

This example uses lookahead assertions ("pre-matching").

function validate(str) {
    let reg = /^(?=.*\d)(?=.*[A-Z])(?=.*[a-z]).{6,}$/;
    return reg.test(str);
}
console.log(validate("123456"));
console.log(validate("123A456"));
console.log(validate("123Ab4*6"));

Analysis:

  1. This expression can be split into five parts: ^, (?=.*\d), (?=.*[A-Z]), (?=.*[a-z]) and .{6,}$. Assuming str is "123Ab*6", the matching process can be understood as follows:

^ matches the start of the string. (?=.*\d) tries .*\d from the current position (the start of the string) and matches "123Ab*6". Next, (?=.*[A-Z]) starts from the current position (still the start of the string, because a lookahead does not move the current position) and matches "123A". The last lookahead, (?=.*[a-z]), works the same way and matches "123Ab". All three lookaheads pass, so matching continues with .{6,}$, which means "six or more arbitrary characters up to the end of the string". Together, these implement the rule "at least six characters, containing digits, uppercase letters, and lowercase letters".
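To see that a lookahead checks a condition without consuming any characters, each condition can be tested on its own. The following sketch is for illustration only; the function explainPassword and its rule labels are hypothetical and do not come from the original code.

```javascript
// Hypothetical helper: lists the rules that a password fails.
// Each rule is one piece of the combined expression, anchored at the start of the string.
function explainPassword(str) {
    const rules = [
        { label: "digit", reg: /^(?=.*\d)/ },
        { label: "uppercase letter", reg: /^(?=.*[A-Z])/ },
        { label: "lowercase letter", reg: /^(?=.*[a-z])/ },
        { label: "at least six characters", reg: /^.{6,}$/ },
    ];
    return rules.filter((rule) => !rule.reg.test(str)).map((rule) => rule.label);
}

console.log(explainPassword("123456"));  // [ 'uppercase letter', 'lowercase letter' ]
console.log(explainPassword("123Ab*6")); // []
// A lookahead is zero-width: the match succeeds but its matched text is empty.
console.log(/^(?=.*\d)/.exec("123Ab*6")[0].length); // 0
```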

The Java equivalent is

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        System.out.println(validate("123456"));
        System.out.println(validate("123A456"));
        System.out.println(validate("123Ab4*6"));
    }

    private static boolean validate(String pwd) {
       Pattern reg = Pattern.compile("^(?=.*\\d)(?=.*[A-Z])(?=.*[a-z]).{6,}$");
       Matcher m = reg.matcher(pwd);
       return m.matches();
    }
}

Add thousands separators to a number

For example, change the number 1234567890 to 1,234,567,890. This example combines grouping with lookahead ("pre-matching").

let num = "1234567890";
let ret = num.replace(/(\d{1,3})(?=(\d{3})+$)/g, function (g, g1) {
    return `${g1},`;
});
console.log(ret);
// Prints 1,234,567,890

Analysis:

  1. "Pre-matching" (a lookahead assertion) is written (?=...).
  2. The expression can be split into two parts: (\d{1,3}) and (?=(\d{3})+$). The content matched by the first () is the value of the g1 parameter of the callback function; you can print it out to make this easier to follow. The lookahead expression (\d{3})+$ means "one or more groups of three digits, up to the end of the string".

The execution of replace can be understood as follows:

First iteration: the content to match is "1234567890". The lookahead (?=(\d{3})+$) has to match "234567890", so (\d{1,3}) can only match "1". g1 in the callback is "1", and the callback returns "1,".
Second iteration: the content to match is "234567890". The lookahead has to match "567890", so (\d{1,3}) matches "234" and the callback returns "234,".
Third iteration: the content to match is "567890". The lookahead matches "890", so (\d{1,3}) matches "567" and the callback returns "567,".
Last attempt: the content to match is "890". Whatever (\d{1,3}) matches, no group of three digits is left for the lookahead, so the match fails and "890" is left unchanged.

The same replacement can be written as num.replace(/(\d{1,3})(?=(\d{3})+$)/g, "$1,"), where $1 is equivalent to g1. If there are more pairs of parentheses, $2, $3 and so on are available as well.
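The iterations described above can be checked by recording what the callback receives on each call. This is a small sketch for illustration; the function name traceThousands is hypothetical.

```javascript
// Hypothetical helper: returns the formatted string plus the g1 value seen on each callback call.
function traceThousands(num) {
    const seen = [];
    const result = num.replace(/(\d{1,3})(?=(\d{3})+$)/g, (g, g1) => {
        seen.push(g1);
        return `${g1},`;
    });
    return { result, seen };
}

console.log(traceThousands("1234567890")); // { result: '1,234,567,890', seen: [ '1', '234', '567' ] }
console.log(traceThousands("123"));        // { result: '123', seen: [] }
```

The final group, 890, never reaches the callback because no three-digit group follows it.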

The Java version is:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        System.out.println(format("1234567890"));
    }

    private static String format(String num) {
       Pattern reg = Pattern.compile("(\\d{1,3})(?=(\\d{3})+$)");
       Matcher m = reg.matcher(num);
       // "$1," inserts the first group followed by a comma
       return m.replaceAll("$1,");
    }
}

Parse XML/HTML-format strings

This example combines a regular expression with a stack, and uses the regular-expression concept of a backreference.

function compile(template) {
    const tagStack = [];
    const all = [];
    /**
     * 1. The expression has three parts: <([a-zA-Z]+)([^<>]*?)(\/?)>, ([^<>]+) and <\/([a-zA-Z]+)>.
     * 2. The three parts are separated by | (that is, they are alternatives).
     * 3. 1) <([a-zA-Z]+)([^<>]*?)(\/?)> matches an opening tag such as <tag> or a self-closing tag
     *       such as <tag/>. "*?" means a non-greedy match (as few characters as possible), as opposed
     *       to "*", which is greedy (as many characters as possible).
     *    2) ([^<>]+) matches the text between tags, as in <tag>text</tag>, without the opening and
     *       closing tags; the parentheses capture the text content.
     *    3) <\/([a-zA-Z]+)> matches a closing tag.
     */
    const startReg = /<([a-zA-Z]+)([^<>]*?)(\/?)>|([^<>]+)|<\/([a-zA-Z]+)>/g;
    let ret = startReg.exec(template);
    let obj;
    while (ret) {
        // If ret[1] has a value, an opening tag was matched; ret[1] is the tag name
        if (ret[1]) {
            obj = {
                tag: ret[1],
                // ret[2] is the matched attribute string, parsed into an object by resolveAttrs
                attrs: resolveAttrs(ret[2]),
                children: [],
            };
            // If the stack holds a previously matched tag, the current tag is a child of the tag on top of the stack
            if (tagStack[tagStack.length - 1]) {
                tagStack[tagStack.length - 1].children.push(obj);
            } else if (ret[3]) {
                // The stack is empty and the tag is self-closing
                all.push(obj);
            }
            // ret[3] is "/" for a self-closing tag; if it is empty, the tag is still open
            if (!ret[3]) {
                tagStack.push(obj);
            }
        }
        // ret[4] is text content inside a tag; store a text node in the tag on top of the stack
        else if (ret[4] && ret[4].replace(/\s/g, "")) {
            tagStack[tagStack.length - 1].children.push({ tag: "", text: ret[4] });
        }
        // If ret[5] is not empty, the current tag has ended
        else if (ret[5]) {
            obj = tagStack.pop();
            // If the stack is now empty, the tag has no parent
            if (tagStack.length === 0) {
                all.push(obj);
            }
        }
        ret = startReg.exec(template);
    }
    return all;
}

// Parses an attribute string into an object
function resolveAttrs(attrString) {
    const reg = /([^<>=\s'"]+)(=(['"])([^"<>=]*)\3)?/g;
    let ret = reg.exec(attrString);
    const map = {};
    while (ret) {
        map[ret[1]] = ret[4];
        ret = reg.exec(attrString);
    }
    return map;
}

Analysis:

  1. Let's first look at what resolveAttrs does.
     1) The regular expression in this function has four groups; the 3rd and 4th are nested inside the 2nd.
     2) \3 is a backreference. For example, for the tag <person name='A' sex="f" active/>, the argument passed to resolveAttrs is "name='A' sex="f" active". On the 1st exec, the 1st group matches name (ret[1] is "name") and the 3rd group matches a single quote, so \3 is also a single quote (\3 stands for whatever the 3rd group matched), and the 4th group therefore matches A (ret[4] is "A"). On the 2nd exec, ret[1] is sex, \3 is a double quote, and ret[4] is f. On the 3rd exec, ret[1] is active and the 2nd group matches nothing, so the final result is {name: "A", sex: "f", active: undefined}.
  2. Try a few XML-format strings of your own and follow the compile process with the help of the comments.
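The effect of the backreference \3 can be checked in isolation. The sketch below repeats the attribute expression under the hypothetical name parseAttrs and adds a tiny second expression, quoted, that is made up purely for illustration.

```javascript
// Same expression as in resolveAttrs: \3 must repeat whichever quote group 3 matched.
function parseAttrs(attrString) {
    const reg = /([^<>=\s'"]+)(=(['"])([^"<>=]*)\3)?/g;
    const map = {};
    let ret = reg.exec(attrString);
    while (ret) {
        map[ret[1]] = ret[4];
        ret = reg.exec(attrString);
    }
    return map;
}

console.log(parseAttrs(`name='A' sex="f" active`));
// { name: 'A', sex: 'f', active: undefined }

// Illustrative expression: the closing quote has to be the same character as the opening one.
const quoted = /^(['"])[^'"]*\1$/;
console.log(quoted.test(`'A'`)); // true
console.log(quoted.test(`'A"`)); // false
```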

A regular-expression solution to the longest palindromic substring

This is a LeetCode problem, and the solution below is one I worked out myself. In essence it is still the "expand around center" approach; it simply uses regular expressions to find the centers to expand from.

var longestPalindrome = function (s) {
    // If s is already a palindrome, return it
    if (isPalindrome(s)) {
        return s;
    }
    // A two-character string that is not a palindrome: return the first character
    if (s.length === 2) {
        return s[0];
    }
    // Look for expansion centers of the form ABA
    let reg = /(\w)(?=(\w\1))/g;
    let ret = reg.exec(s);
    let starts = [];
    while (ret) {
        let len = ret[2].length + 1;
        let start = ret.index;
        starts.push({
            start,
            len
        });
        ret = reg.exec(s);
    }
    // Look for expansion centers of the form AA
    reg = /(\w)(?=(\1))/g;
    ret = reg.exec(s);
    while (ret) {
        let len = ret[2].length + 1;
        let start = ret.index;
        starts.push({
            start,
            len
        });
        ret = reg.exec(s);
    }
    let max = s[0];
    starts.forEach(pair => {
        let { start,len } = pair;
        let _max = s.substring(start, start + len);
        while (true) {
            start--;
            if (start < 0) {
                break;
            }
            len += 2;
            let target = s.substring(start, start + len);
            // Expand outwards while the two ends match, keeping the longest palindrome found
            if (s[start] === s[start + len - 1]) {
                if (_max.length < target.length) {
                    _max = target;
                }
            } else {
                break;
            }
        }
        if (max.length < _max.length) {
            max = _max;
        }
    });
    return max;
};
function isPalindrome(s) {
    let _s = [...s];
    _s.reverse();
    return _s.join('') === s;
}
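To make the role of the two expressions concrete, the following sketch only collects the centers they find, without the expansion step. findCenters is a hypothetical helper written for this illustration.

```javascript
// Hypothetical helper: returns the start index, length and text of every ABA and AA center in s.
function findCenters(s) {
    const centers = [];
    // The lookahead keeps each match one character wide, so overlapping centers are all found.
    for (const reg of [/(\w)(?=(\w\1))/g, /(\w)(?=(\1))/g]) {
        let ret = reg.exec(s);
        while (ret) {
            const len = ret[2].length + 1;
            centers.push({ start: ret.index, len, text: s.substring(ret.index, ret.index + len) });
            ret = reg.exec(s);
        }
    }
    return centers;
}

console.log(findCenters("babad"));
// [ { start: 0, len: 3, text: 'bab' }, { start: 1, len: 3, text: 'aba' } ]
console.log(findCenters("cbbd"));
// [ { start: 1, len: 2, text: 'bb' } ]
```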