Young man, I can see that your bone structure is exquisite. You are a martial arts talent of the kind that appears once in a century. Fate has brought you and me together, so come and train under your teacher.

  • Eighteen Regex Moves - Move 1: Strike Straight at Huanglong
  • Eighteen Regex Moves - Move 2: Control the Crane to Capture the Dragon
  • Eighteen Regex Moves - Move 3: The Dragon Leaps in the Deep
  • Eighteen Regex Moves - Move 4: Ash

Continuing from: Eighteen Regex Moves - Move 1: Strike Straight at Huanglong

Hidden Fan Fairy: "Disciple, how much of the last move have you taken in?"
Jet: "I have forgotten half of it."
Half an hour later...
Hidden Fan Fairy: "And now?"
Jet: "I have forgotten all of it."
Hidden Fan Fairy: "Very good. Now let your teacher show you Eighteen Regex Moves - Move 2: Control the Crane to Capture the Dragon."

First, the squeeze technique

1. Greedy mode (default)
Hidden Fan Fairy: "Your task now is to match the titles below."
Jet: "Won't .* do?"
Hidden Fan Fairy: "Try it and see."
Jet: "How did that happen? By default it seems to match as much as it possibly can."
Hidden Fan Fairy: "That is greedy mode."
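
To make the greedy behaviour concrete, here is a minimal sketch. The sample line and the square-bracket delimiters are hypothetical, chosen only for illustration; they are not the contents of book.txt.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Returns the first match of regex in text, or null when nothing matches.
    static String firstMatch(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample: two bracketed titles on one line.
        String line = "[Dune] and [Emma]";
        // Greedy .* runs on to the LAST ']' on the line, so both titles
        // and the text between them come back as a single match.
        System.out.println(firstMatch("\\[.*\\]", line)); // prints [Dune] and [Emma]
    }
}
```

The match swallows everything between the first opening bracket and the last closing bracket, which is exactly the surprise Jet runs into.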


2. Non-greedy mode
Jet: "How do I fix it, Master?"
Hidden Fan Fairy: "Put a question mark after the quantifier and it stops being greedy."

Jet: "Master, that one is rock solid: the target is trapped between two delimiters."
Hidden Fan Fairy: "Hence the name of this move: Control the Crane to Capture the Dragon."
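
The difference between the two modes can be sketched side by side. As before, the sample line and the square-bracket delimiters are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyDemo {
    // Collects every match of regex in text, in order of appearance.
    static List<String> allMatches(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical sample: two bracketed titles on one line.
        String line = "[Dune] and [Emma]";
        // Greedy: one match that spans both titles.
        System.out.println(allMatches("\\[.*\\]", line));  // prints [[Dune] and [Emma]]
        // Lazy: .*? stops at the first ']' it can, so each title matches separately.
        System.out.println(allMatches("\\[.*?\\]", line)); // prints [[Dune], [Emma]]
    }
}
```

The opening and closing delimiters are the "crane" that pins the target in place; the lazy quantifier in the middle takes only as much as it must.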


Second, regex in Java

Hidden Fan Fairy: "This is one of your teacher's magic weapons: the Java processing machine."
Jet: "I have heard that Master wields regex through Java code as smoothly as flowing water."
Hidden Fan Fairy: "Not only Java. Every language is bound up with regex; one without it would be a poor language indeed. Now you can start extracting information."

regex
├── src
│   ├── main
│   │   ├── java
│   │   │   └── com
│   │   │       └── toly1994
│   │   │           └── regex
│   │   │               └── Parser.java
│   │   └── resources
│   │       └── regx
│   │           ├── book.txt
│   │           └── regx.txt

1. Reading the file with a character stream

For convenience, the text to parse is kept in a file. Since all we need is a string, FileReader is enough to read it.

    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;

    /**
     * Author: Zhang Fengjietele
     * Time: 2019-10-24
     * Email: [email protected]
     */
    public class Parser {
        public static void main(String[] args) {
            // Resolve book.txt relative to the working directory.
            String dir = System.getProperty("user.dir");
            File file = new File(dir, "regex/src/main/resources/regx/book.txt");
            System.out.println(readFile(file));
        }

        // Reads the whole file into a String, 1024 characters at a time.
        private static String readFile(File file) {
            StringBuilder sb = new StringBuilder();
            try (FileReader fr = new FileReader(file)) {
                char[] buff = new char[1024];
                int len;
                while ((len = fr.read(buff)) != -1) {
                    sb.append(buff, 0, len);
                }
            } catch (IOException | RuntimeException e) {
                e.printStackTrace();
            }
            return sb.toString();
        }
    }
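
FileReader decodes the file with the default charset. If the text is known to be UTF-8, the charset can be named explicitly instead. The sketch below is one possible variant, not the author's code; the class name Utf8Reader and the assumption that book.txt is UTF-8 encoded are both hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Utf8Reader {
    // Reads the whole file as UTF-8 text (assumption: the file is UTF-8 encoded).
    static String readUtf8(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Rethrow unchecked so callers are not forced to handle IOException.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same location as in the article, resolved against the working directory.
        Path path = Paths.get(System.getProperty("user.dir"),
                "regex/src/main/resources/regx/book.txt");
        System.out.println(readUtf8(path));
    }
}
```

Unlike the readFile method above, this variant fails loudly when the file cannot be read instead of returning an empty string.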

2. Filtering out the matching text

The highlighted matches can also be obtained in code. In Java this is done with the Pattern and Matcher classes.

This makes it easy to pull every title out of the file, whether it has hundreds of thousands of lines or millions.

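    // Needs java.util.regex.Matcher and java.util.regex.Pattern imported in Parser.java.
    private static void regexBook(String target) {
        // Lazy quantifier: the title's opening and closing delimiters go on either side of .*?
        String regex = ".*?";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(target);
        // find() advances to the next match; group() returns the matched text.
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }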
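
When the matches are needed as data rather than printed, they can be collected into a list. The following is a rough sketch under stated assumptions: the square-bracket delimiters and the sample text are hypothetical, and Matcher.results() requires Java 9 or later.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class TitleCollector {
    // Hypothetical delimiters: assumes each title is wrapped in [ ].
    // Compiled once and reused, which matters when the input is large.
    private static final Pattern TITLE = Pattern.compile("\\[.*?\\]");

    // Returns every bracketed title in text, delimiters included.
    static List<String> titles(String text) {
        return TITLE.matcher(text)
                .results()               // Stream<MatchResult>, Java 9+
                .map(MatchResult::group) // full text of each match
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for the contents of book.txt.
        String text = "1. [Dune]\n2. [Emma]\n";
        List<String> found = titles(text);
        System.out.println(found.size() + " titles: " + found);
    }
}
```

Because . does not match a line break by default, each lazy match stays within a single line.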

Third, Control the Crane to Capture the Dragon

1. Template matching

Now define a template syntax, ${}, and extract every piece of text written in that form.

Jet: "That's easy. Watch me: Capture the Dragon!"

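    // Needs java.util.regex.Matcher and java.util.regex.Pattern imported in Parser.java.
    private static void regexWidget(String target) {
        // $ and { are regex metacharacters, so both are escaped; .*? stops at the first }
        String regex = "\\$\\{.*?\\}";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(target);
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }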
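
Often only the name inside ${} is wanted, not the wrapper. One way to get it is a capturing group. The sketch below also swaps the lazy .*? for a negated character class, [^}]*, which is a different technique from the one in the article but yields the same matches here. The class name and sample text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateNames {
    // [^}]* matches any run of characters that are not '}', so it cannot
    // overshoot the closing brace. The parentheses capture the inner name.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]*)\\}");

    // Returns the text inside each ${...}, without the ${ and }.
    static List<String> names(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group(1)); // group(1) is the captured name only
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical template text.
        String text = "Hello ${name}, your order ${orderId} has shipped.";
        System.out.println(names(text)); // prints [name, orderId]
    }
}
```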


2. Web page matching

Jet: "This is interesting. The key is to analyze the structure and pin the target between its delimiters." (Scrolling through this lazy-loaded page was tiring. I scrolled until a little over 4,000 entries had loaded, which is enough.)


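    // Needs java.util.regex.Matcher and java.util.regex.Pattern imported in Parser.java.
    private static void regexHtml(String target) {
        // Match from username"> up to the first < that follows it.
        String regex = "username\">.*?<";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(target);
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }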
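
The matches printed by regexHtml still carry the username"> prefix and the trailing <. If only the name itself is wanted, one option is to move the delimiters into lookaround assertions, which check the surrounding text without including it in the match. This is a sketch of that alternative, not the article's method; the class name and the sample HTML are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UsernameExtractor {
    // (?<=username">) requires the prefix just before the match;
    // (?=<) requires a '<' just after it. Neither is part of the match.
    private static final Pattern USERNAME =
            Pattern.compile("(?<=username\">).*?(?=<)");

    // Returns the text between username"> and the next '<' for every occurrence.
    static List<String> usernames(String html) {
        List<String> found = new ArrayList<>();
        Matcher matcher = USERNAME.matcher(html);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical markup; the real page structure may differ.
        String html = "<a class=\"username\">alice</a><a class=\"username\">bob</a>";
        System.out.println(usernames(html)); // prints [alice, bob]
    }
}
```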

Hidden Fan Fairy: "Practice Control the Crane to Capture the Dragon often. It is very useful when extracting information."
Jet: "I feel the same way. It is especially handy for text with a fixed structure, such as web pages. What is the next move?"
Hidden Fan Fairy: "Practice first, and let your teacher rest a while. The next move is Eighteen Regex Moves - Move 3: The Dragon Leaps in the Deep."

Afterword

1. This article is an original work by Zhang Fengjie; please credit the source when reprinting.
2. If you would like to discuss anything, please leave a comment. You can also add me on WeChat: zdl1994328
3. My ability is limited. If anything here is wrong, corrections are welcome and I will gladly fix it.
4. Thank you for reading this far, and for your support. You can also follow the official account: the King of Programming.