Let's use Python for some text processing together. We start with a text file named "data.txt" that contains three lines of mixed Chinese and English text plus two empty lines. The goal is to extract the English and the corresponding Chinese from each of those three lines. Combine this with a crawler and you can build your own personal English dictionary, which is both cool and practical. So what are we waiting for? On to the code!

This article covers the basics, so we start with the simplest building block and run the code first to see what it does.
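Here is a minimal sketch of that first version. The contents written to data.txt are an assumption made only so the example runs on its own (three made-up lines of English and Chinese, with two empty lines between them); the original file is not shown. The `with` statement and the `encoding="utf-8"` argument are also additions: the first closes the file automatically, the second keeps the Chinese text readable on any platform.

```python
# Create a sample data.txt so the example is self-contained.
# These contents are hypothetical: three lines of text and two empty lines.
sample = "apple n. 苹果\n\nbanana n. 香蕉\n\norange n. 橙子\n"
with open("data.txt", "w", encoding="utf-8") as f:
    f.write(sample)

# V1: open the file and print it line by line.
lines_seen = 0  # counter added only to show how many lines were read
with open("data.txt", encoding="utf-8") as f:
    for line in f:
        print(line)
        lines_seen += 1

print(lines_seen)  # 5: three lines of text plus two empty lines
```

Each line read from the file still ends with its own newline, and print() adds another, so the output looks double-spaced.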

It looks like it just prints out the text, and you're right: it really is that simple. We're aiming for a simple beginning, a simple middle, and a simple end 🙂

First comes open("data.txt"), which tells Python to open a file named "data.txt". What happens once it is open? The for loop reads the file line by line. Some readers may be meeting this syntax for the first time and feel they don't understand it. That's fine: it isn't that you can't understand it, you just need time to get used to something new. Type the code out a few times, once a day, and within a week it will feel completely familiar. If you don't believe it, try it: 3 minutes a day is only 21 minutes a week. It really works.

Then comes print(line): however many lines are read, that many are printed, even the blank ones.

Once you have mastered these three points (open, for, and print), you are already past the entry level! Next comes V2, but let's look at its output first.

Some readers can't help shouting: garbled text! Don't panic, these square brackets are not gibberish, they are syntax. In Python, a pair of square brackets [] denotes a list. Yes, a list, and it is there to be used 🙂

A list can be empty, like the two empty lists in the V2 output, or it can contain multiple elements, like the other three lists, each of which contains three elements. Here each element is a string, and a pair of single quotes marks the beginning and end of a string. Some readers will ask whether double quotes work too. Yes, they do 🙂

Within the same list, elements are separated by commas.
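The points about lists can be tried out in a few lines. This is only an illustrative sketch; the variable names and the sample words are made up for the example.

```python
empty_list = []                       # a list with no elements
words = ['apple', 'n.', '苹果']       # three string elements, separated by commas
same_words = ["apple", "n.", "苹果"]  # double quotes work just as well

print(empty_list)                     # []
print(words)                          # ['apple', 'n.', '苹果']
print(len(empty_list), len(words))    # 0 3
print(words == same_words)            # True
```

Python always shows the strings with single quotes when it prints a list, whichever kind of quotes you typed.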

Now that we understand the output, the next step is to look at the source code.

The V2 version adds one new line of code that defines new_line, the result of applying two operations to line. new_line is the list of elements we saw earlier.
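A sketch of what V2 might look like is given here. It assumes the two operations are rstrip() followed by split(), applied in that order, and it uses the same made-up data.txt contents as a stand-in for the original file, which is not shown.

```python
# Hypothetical sample file: three lines of text and two empty lines.
sample = "apple n. 苹果\n\nbanana n. 香蕉\n\norange n. 橙子\n"
with open("data.txt", "w", encoding="utf-8") as f:
    f.write(sample)

# V2: the only change from V1 is the new_line step.
with open("data.txt", encoding="utf-8") as f:
    for line in f:
        new_line = line.rstrip().split()  # two operations on line
        print(new_line)
```

With this sample, the loop prints three lists of three elements each, with an empty list [] for each of the two empty lines.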

So what did we do to line?

To illustrate what is done to line, we need to define a set of variables: an empty string (emptyString), a single character (strA), a character followed by a trailing space (strAWithTrailingWhitespace), and the result of calling rstrip() on that last string (strArstriped). Note that rstrip() is not the same as strip(): rstrip() removes whitespace only from the right-hand end, while strip() removes it from both ends.

Let's go through them one by one, starting with the empty string.

An empty string contains no characters, so its length, len(emptyString), is 0.

Next is a string containing one character (strA), so its length, len(strA), is 1. We then print the string followed by a ".".

With that groundwork laid, here comes the key point! The next variable is a character with a trailing space (Trailing Whitespace, and yes, the variable name spells it out in full :), so its length is 2. Note that when it is printed there is a visible space between "a" and ".", and you can even select it with the mouse.

Finally there is strArstriped, the new variable obtained by calling rstrip() on strAWithTrailingWhitespace to remove the trailing space, so the length goes back to 1. Note that in the printed output the space between "a" and "." has been removed by rstrip().
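The four variables can be put side by side in a short sketch. The variable names come from the text; appending "." by string concatenation is an assumption about how the original printed it.

```python
emptyString = ""                    # no characters at all
strA = "a"                          # one character
strAWithTrailingWhitespace = "a "   # one character plus a trailing space
strArstriped = strAWithTrailingWhitespace.rstrip()  # trailing space removed

# Print each length, then the string followed by "." to make spaces visible.
print(len(emptyString), emptyString + ".")                                # 0 .
print(len(strA), strA + ".")                                              # 1 a.
print(len(strAWithTrailingWhitespace), strAWithTrailingWhitespace + ".")  # 2 a .
print(len(strArstriped), strArstriped + ".")                              # 1 a.
```

The "." marker is what makes the trailing space visible: without it, "a" and "a " look the same on screen.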

To make this easier to follow, we use two lists: one stores the raw lines without rstrip(), the other stores the lines after rstrip() has processed them.
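The two-list comparison could be sketched like this. The list names raw_lines and stripped_lines are hypothetical, and the data.txt contents are the same made-up sample rather than the original file.

```python
# Hypothetical sample file: three lines of text and two empty lines.
sample = "apple n. 苹果\n\nbanana n. 香蕉\n\norange n. 橙子\n"
with open("data.txt", "w", encoding="utf-8") as f:
    f.write(sample)

raw_lines = []       # lines exactly as read from the file
stripped_lines = []  # the same lines after rstrip()

with open("data.txt", encoding="utf-8") as f:
    for line in f:
        raw_lines.append(line)
        stripped_lines.append(line.rstrip())

print(raw_lines)       # every element still ends with '\n'
print(stripped_lines)  # the '\n' is gone; empty lines become ''
```

The newline character counts as whitespace, which is why rstrip() removes it along with any trailing spaces.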

Then there is the split() method. As the name suggests, it splits a line of text into smaller pieces. The default delimiter is whitespace, and empty strings are removed from the result.
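A small sketch shows the default behaviour of split() next to an explicit delimiter. The sample string is made up, with extra spaces added on purpose.

```python
text = "apple   n. 苹果"          # three spaces after "apple", on purpose

default_parts = text.split()      # default: split on any run of whitespace
space_parts = text.split(" ")     # explicit delimiter: exactly one space
empty_parts = "".split()          # splitting an empty string

print(default_parts)  # ['apple', 'n.', '苹果']
print(space_parts)    # ['apple', '', '', 'n.', '苹果']
print(empty_parts)    # []
```

Only the default form drops the empty strings; with an explicit " " delimiter they stay in the result. The last line is why an empty line in the file turns into an empty list.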

And now the final version: listE stores the English, and listC stores the corresponding Chinese.
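One way the final version might look is sketched here. The names listE and listC come from the text; everything else is an assumption, in particular the made-up file contents and the guess that the English is the first element of each line and the Chinese is the last.

```python
# Hypothetical sample file: three lines of text and two empty lines.
sample = "apple n. 苹果\n\nbanana n. 香蕉\n\norange n. 橙子\n"
with open("data.txt", "w", encoding="utf-8") as f:
    f.write(sample)

listE = []  # English
listC = []  # corresponding Chinese

with open("data.txt", encoding="utf-8") as f:
    for line in f:
        new_line = line.rstrip().split()
        if not new_line:             # empty line gives an empty list: skip it
            continue
        listE.append(new_line[0])    # assumption: English comes first
        listC.append(new_line[-1])   # assumption: Chinese comes last

print(listE)  # ['apple', 'banana', 'orange']
print(listC)  # ['苹果', '香蕉', '橙子']
```

Because both lists are filled in the same loop, the English at a given position in listE matches the Chinese at the same position in listC.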

Now you have truly gotten started. What comes next is steady practice and consolidation, to lay a solid foundation for the work ahead.

