Introduction

If you are familiar with Linux, you have probably heard of the famous "three swordsmen" of Linux: grep, awk, and sed. The protagonist of this article is sed.

sed is a stream editor that edits text programmatically, as opposed to vim's interactive approach. It is very powerful: with its regular expression support it can carry out many complex text editing operations.

In fact, sed offers so much functionality that whole books are devoted to it. This article does not cover every aspect of sed. It explains how sed works and how it is commonly used, and it gives plenty of practical examples to make the basic usage easier to understand. Once you have really mastered the points in this article, you will be able to handle the basic needs of everyday work.

sed is a good fit in the following scenarios:

  • Automated programs, where interactive editing is not suitable;
  • Large-scale, repetitive editing tasks;
  • Editing commands that are too complex to type in an interactive text editor.

How sed works

As a non-interactive editor, sed applies editing instructions prepared in advance to the input text and outputs the result when it is done.

In brief, sed works as follows:

  • sed reads the input one line at a time and stores the current line in a temporary buffer called the pattern space;
  • The sed commands are then applied to the contents of the pattern space;
  • When processing is finished, the contents of the pattern space are sent to the screen;
  • sed then moves on to the next line.

This repeats until the end of the file. The file itself is not changed unless you redirect the output or specify the -i option.
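The point about the file staying unchanged can be checked with a small sketch. It assumes GNU sed (the -i syntax differs on other implementations), and the scratch file created with mktemp and its contents are made up for illustration.

```shell
# Create a scratch file with hypothetical sample content.
tmp=$(mktemp)
printf 'hello world\nhello sed\n' > "$tmp"

# Without -i: the edited text only goes to standard output.
printed=$(sed 's/hello/HELLO/' "$tmp")
after_plain=$(cat "$tmp")        # the file still holds the original text

# With -i (GNU sed syntax): the file itself is rewritten.
sed -i 's/hello/HELLO/' "$tmp"
after_inplace=$(cat "$tmp")

printf 'stdout:\n%s\n' "$printed"
printf 'file without -i:\n%s\n' "$after_plain"
printf 'file with -i:\n%s\n' "$after_inplace"
rm -f "$tmp"
```

Only the last step changes what is stored on disk; the first call leaves the file exactly as it was.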

Regular expressions

sed is basically a game of regular expression pattern matching, so people who are good with sed tend to have strong regular expression skills. Regular expressions are a broad topic, and this section does not cover them in depth.

To make the rest of the article easier to follow, here is a brief introduction to the basics. If you want to study regular expressions in depth, buy a book dedicated to them.

(1) Basic regular expressions

  • . matches any single character except the newline character, similar to the shell wildcard ?;
  • * means the preceding character occurs zero or more times;
  • .* means any character, zero or more times, that is, it matches anything;
  • ^ represents the beginning of a line; ^abc matches a line beginning with abc;
  • $ represents the end of a line; }$ matches a line ending in a closing brace;
  • \{\} gives the number of repetitions of the preceding character. In basic regular expressions the braces must be escaped: \{2\} means exactly twice, \{2,\} at least twice, \{2,4\} two to four times;
  • [] brackets contain a set of characters, any one of which may match.
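The rules above can be tried out quickly with the sketch below. It assumes GNU sed; the sample strings are invented for illustration, and `sed -n '/regex/p'` is used here simply to print the lines a pattern matches.

```shell
# Sample input, one candidate string per line (made up for illustration).
input=$(printf 'abc\nxabc\naab\nab\n}\nx}\n')

# ^abc : lines that begin with abc
starts=$(printf '%s\n' "$input" | sed -n '/^abc/p')

# }$ : lines that end with a closing brace
ends=$(printf '%s\n' "$input" | sed -n '/}$/p')

# a\{2\} : two consecutive a's (braces are escaped in basic regular expressions)
twice=$(printf '%s\n' "$input" | sed -n '/a\{2\}/p')

# [xy] : a character set, here "x or y"
charset=$(printf '%s\n' "$input" | sed -n '/[xy]/p')

printf '%s\n' "$starts" "$ends" "$twice" "$charset"
```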

(2) Extended regular expressions

Extended regular expressions are not used as often as basic ones, but they are still important, and in many cases you cannot do without them. GNU sed enables extended regular expressions with the -r option (-E is an equivalent spelling).

  • ?: the preceding character occurs zero or one time;
  • +: the preceding character occurs one or more times;
  • |: matches any one of the alternatives;
  • (): denotes a group. (a|b)b matches the substring ab or bb, and the replacement can use \1, \2 to refer to the text matched by each group;
  • {}: has the same meaning as the braces in basic regular expressions, except that it is written without the escaping backslashes.
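Here is a short sketch of the extended operators in action. It assumes GNU sed with the -r option; the input strings are made up for illustration.

```shell
# (a|b)b matches "ab" or "bb"; -r switches on extended syntax in GNU sed.
group=$(printf 'ab\nbb\ncb\n' | sed -n -r '/^(a|b)b$/p')

# + needs at least one preceding character, so "ct" is left alone.
plus=$(echo 'caaat ct' | sed -r 's/ca+t/X/g')

# ? allows zero or one preceding character, so both spellings match.
optional=$(echo 'color colour' | sed -r 's/colou?r/X/g')

# \1 and \2 refer back to the first and second group.
swapped=$(echo 'key=value' | sed -r 's/(.*)=(.*)/\2=\1/')

printf '%s\n' "$group" "$plus" "$optional" "$swapped"
```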

Basic syntax

Here is the basic syntax of sed.

sed [options] 'command' filename

Options: the common ones are -n, -e, -i, -f, and -r.

Command: each subcommand has the following format:

[address1,address2][function][arguments (flags)]

The options are simple:

  • -n: quiet mode. By default sed prints every line to the screen after processing it; with -n it prints nothing unless a command explicitly asks for it;
  • -e: to perform several operations on the text in one run, pass each subcommand with its own -e;
  • -i: by default sed only works on the copy of each line held in the pattern space and does not modify the file. To modify the file itself, specify -i;
  • -f: when there are many subcommands, chaining them with -e becomes unwieldy. Write the subcommands into a script file and run it with -f;
  • -r: enables extended regular expressions.
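The sketch below puts -n, -e, and -f side by side on the same input. It assumes GNU sed; the sample text and the temporary script file are hypothetical.

```shell
input=$(printf 'my cat\nThis dog\n')

# Default: every line is printed after processing.
default_out=$(printf '%s\n' "$input" | sed 's/my/your/')

# -n: nothing is printed unless a command asks for it (here the p flag).
quiet_out=$(printf '%s\n' "$input" | sed -n 's/my/your/p')

# -e: several subcommands in one invocation.
multi_out=$(printf '%s\n' "$input" | sed -e 's/my/your/' -e 's/This/That/')

# -f: the same subcommands read from a script file, one per line.
script=$(mktemp)
printf 's/my/your/\ns/This/That/\n' > "$script"
file_out=$(printf '%s\n' "$input" | sed -f "$script")
rm -f "$script"

printf '%s\n' "$default_out" "$quiet_out" "$multi_out" "$file_out"
```

The -e and -f variants produce the same result; the script file simply scales better once the list of subcommands grows.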

Numeric and regular-expression addressing

By default sed matches, processes, and outputs every line. Sometimes we do not want to touch everything but only part of the input, such as lines 1-10, the even lines, or the lines that contain the string hello.

In that case we need to select specific lines for processing instead of the whole content. This is called addressing.

(1) Numeric addressing

Numeric addressing simply specifies the lines to operate on by number. There are several forms, each suited to different situations.

# Replace hello with A on line 4 only
$ sed '4s/hello/A/g' file.txt
# Replace hello with A on lines 2-4
$ sed '2,4s/hello/A/g' file.txt
# Start at line 2 and continue for 4 more lines, that is, lines 2-6
$ sed '2,+4s/hello/A/g' file.txt
# Replace hello with A on the last line
$ sed '$s/hello/A/g' file.txt
# Replace hello with A on every line except line 1
$ sed '1!s/hello/A/g' file.txt

(2) Regular-expression addressing

Regular-expression addressing uses pattern matching to decide which lines are edited and which are left alone.

# Delete the lines matching hello; d means delete
$ sed '/hello/d' file.txt
# Delete blank lines; "^$" matches an empty line
$ sed '/^$/d' file.txt
# Delete everything from a line starting with ts through the next line starting with te
$ sed '/^ts/,/^te/d' file.txt

(3) Combining numeric and regular-expression addresses

Numeric and regular-expression addresses can be used together.

# Delete from line 1 through the first line starting with ts
$ sed '1,/^ts/d' file.txt
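To see which lines each kind of address selects, the following sketch runs the address forms above against a five-line sample. It assumes GNU sed (the `2,+1` form is a GNU extension); the sample lines are made up for illustration.

```shell
input=$(printf 'one\nts start\nthree\nte end\nfive\n')

# Single line number: prefix line 2 only.
only_line2=$(printf '%s\n' "$input" | sed '2s/^/> /')

# Line number plus offset: delete line 2 and the 1 line after it.
offset=$(printf '%s\n' "$input" | sed '2,+1d')

# Negated address: prefix every line except line 1.
not_first=$(printf '%s\n' "$input" | sed '1!s/^/> /')

# Regex range: delete from the ^ts line through the ^te line.
regex_range=$(printf '%s\n' "$input" | sed '/^ts/,/^te/d')

# Mixed range: delete from line 1 through the first ^ts line.
mixed=$(printf '%s\n' "$input" | sed '1,/^ts/d')

printf '%s\n---\n' "$only_line2" "$offset" "$not_first" "$regex_range" "$mixed"
```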

Basic subcommands

(1) Substitute subcommand s

The substitute subcommand s is the most frequently used sed command. It supports regular expressions and is powerful enough to cover the basic uses of grep.

Basic syntax:

[address]s/pat/rep/flags

Basic usage of the substitute subcommand

# Replace hello with HELLO on each line, changing only the first match
$ sed 's/hello/HELLO/' file.txt

# Replace every hello with HELLO; g means all matches on a line
$ sed 's/hello/HELLO/g' file.txt

# Replace only the second hello on each line with A
$ sed 's/hello/A/2' file.txt

# Replace the second match and every match after it on each line (GNU sed)
$ sed 's/hello/A/2g' file.txt

# Add # to the beginning of each line
$ sed 's/^/#/g' file.txt

# Add something to the end of each line
$ sed 's/$/xxx/g' file.txt

Simple use of regular expressions

# Using extended regular expressions; the result is: A
$ echo "hello 123 world" | sed -r 's/[a-z]+ [0-9]+ [a-z]+/A/'

# Sample input: <b>This</b> is what <span style="x">I</span> meant
# Requirement: remove the tags from the HTML above
$ sed 's/<[^>]*>//g' file.txt

Multiple commands

# Replace my with your on lines 1-3, and This with That from line 3 onward
$ sed '1,3s/my/your/g; 3,$s/This/That/g' my.txt

# This is equivalent to
$ sed -e '1,3s/my/your/g' -e '3,$s/This/That/g' my.txt

Using the matched text

# Put double quotation marks around the matched string; the result is: My "name" chopin
# "&" stands for the entire matched text
$ echo "My name chopin" | sed 's/name/"&"/'

# The result is: hello=world; "\1" and "\2" stand for the text matched by the first and second pair of parentheses
$ echo "hello,123,world" | sed 's/\([^,]*\),.*,\(.*\)/\1=\2/'
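As a further illustration of & and numbered groups, here is a minimal sketch using basic regular expressions. The input strings are invented for the example.

```shell
# & stands for the whole match: wrap every number in square brackets.
wrapped=$(echo 'port 80 and 443' | sed 's/[0-9][0-9]*/[&]/g')

# \1, \2, \3 stand for the three groups: reorder year-month-day to day.month.year.
date_out=$(echo '2024-01-31' | sed 's/\([0-9]*\)-\([0-9]*\)-\([0-9]*\)/\3.\2.\1/')

echo "$wrapped"    # port [80] and [443]
echo "$date_out"   # 31.01.2024
```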

A few other common uses

# -n suppresses the automatic printing of the pattern space; the p flag prints only the lines where a substitution was made
$ sed -n 's/i/A/p' file.txt

# The i flag makes the match case-insensitive, so both i and I are replaced with A (GNU sed)
$ sed -n 's/i/A/ip' file.txt

# Save the substituted lines to a file, either with the w flag or by redirecting the output
$ sed -n 's/i/A/w b.txt' file.txt
$ sed -n 's/i/A/p' file.txt > b.txt

Note that by default sed simply outputs the modified pattern space and does not save anything to the original file. To modify the original file, specify the -i option.

(2) Append subcommand a

The subcommand a inserts the given line of text below the addressed line.

# Append a line containing A below every line
$ sed 'a A' file.txt

# Append a line containing A below each of lines 1-2
$ sed '1,2a A' file.txt

(3) Insert subcommand i

The subcommand i is used in much the same way as a, except that it inserts the given line of text above the addressed line.

# Insert a line containing A above each of lines 1-2
$ sed '1,2i A' file.txt

(4) Change subcommand c

The subcommand c replaces the addressed lines with the given line of text.

# Replace every line of the file with A
$ sed 'c A' file.txt

# Replace lines 1-2 with A; note that the two lines become a single line A
$ sed '1,2c A' file.txt

# Replace lines 1-2 with two lines, each containing A (GNU sed interprets \n in the text)
$ sed '1,2c A\nA' file.txt
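The difference between a, i, and c is easiest to see on the same input. The sketch below assumes GNU sed, which accepts the one-line `a text` form used in this article; the three sample lines are made up.

```shell
input=$(printf 'one\ntwo\nthree\n')

# a: the new line goes below line 2.
appended=$(printf '%s\n' "$input" | sed '2a A')

# i: the new line goes above line 2.
inserted=$(printf '%s\n' "$input" | sed '2i A')

# c: lines 1-2 are replaced as a whole by a single line.
changed=$(printf '%s\n' "$input" | sed '1,2c A')

printf '%s\n---\n' "$appended" "$inserted" "$changed"
```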

(5) Delete subcommand d

The subcommand d deletes the addressed lines, which is easy to understand.

# Delete lines 1-3 of the file
$ sed '1,3d' file.txt

# Delete the lines starting with This
$ sed '/^This/d' file.txt

(6) Line number subcommand =

The subcommand = prints the line number.

# Print the line number above each of lines 1-2
$ sed '1,2=' file.txt

# Put the line number at the beginning of each line
$ sed '=' file.txt | sed 'N; s/\n/\t/'
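A small sketch of the numbering idiom with concrete input follows. It assumes GNU sed (which understands \t in the replacement); the two sample lines are invented. The `$=` form, which prints only the number of the last line, is a common way to count lines.

```shell
input=$(printf 'alpha\nbeta\n')

# = prints each line number on its own line; the second sed joins
# the number and the text with a tab.
numbered=$(printf '%s\n' "$input" | sed '=' | sed 'N; s/\n/\t/')

# $= prints the number of the last line, that is, the line count.
count=$(printf '%s\n' "$input" | sed -n '$=')

printf '%s\n' "$numbered"
echo "$count"    # 2
```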

(7) Subcommand N

The subcommand N appends the next line of input to the pattern space so that both lines can be matched together. Note that the newline (\n) between the two lines is preserved.

In other words, the line after the current one is read into the pattern space as well, and the two are matched and modified together. For example:

# Merge each even line into the odd line before it
$ sed 'N; s/\n//' file.txt
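Here is a sketch of N on inputs with an even and an odd number of lines. The `$!N` form used here is an assumption about what you may want in practice: it skips N on the last line, so a leftover final line is handled the same way whatever the sed implementation.

```shell
# Even number of lines: each pair is joined with a space.
even=$(printf '1\n2\n3\n4\n' | sed '$!N; s/\n/ /')

# Odd number of lines: the last line has no partner and is printed as is.
odd=$(printf '1\n2\n3\n' | sed '$!N; s/\n/ /')

printf '%s\n---\n%s\n' "$even" "$odd"
```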

Ha ha, isn’t that easy?

Practice

Once you have mastered the basic commands above, you can handle roughly 95% of everyday needs. sed also has some advanced concepts, such as working with the pattern space and hold space, advanced subcommands, and branches and tests. They are rarely needed in day-to-day work, so this article does not cover them for now.

With these basics in hand, the more you practice and use them, the more efficient your text processing will become. Below are some practical exercises that I hope you will find helpful.

1. Delete the second character of each line of the file

$ sed -r 's/(.)(.)(.*)$/\1\3/' file.txt

2. Swap the first and second characters of each line

$ sed -r 's/(.)(.)(.*)/\2\1\3/' file.txt

3. Delete all digits in the file

$ sed 's/[0-9]//g' file.txt

4. Replace each run of spaces in the file with a tab

$ sed -r 's/ +/\t/g' file.txt

5. Enclose all capital letters in parentheses ()

$ sed -r 's/([A-Z])/(\1)/g' file.txt

6. Delete every other line (the even-numbered lines)

$ sed '0~2{d}' file.txt

7. Delete all blank lines

$ sed '/^$/d' file.txt
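The exercises above can be checked against small inputs with the sketch below. It assumes GNU sed (for -r, \t, and the `0~2` address); the sample strings are made up, and LC_ALL=C is set for the capital-letter case as a precaution so that [A-Z] is interpreted as plain ASCII.

```shell
input='Ab1 cd2'

no_second=$(echo "$input" | sed -r 's/(.)(.)(.*)$/\1\3/')      # exercise 1
swapped=$(echo "$input" | sed -r 's/(.)(.)(.*)/\2\1\3/')       # exercise 2
no_digits=$(echo "$input" | sed 's/[0-9]//g')                  # exercise 3
tabs=$(echo 'a  b c' | sed -r 's/ +/\t/g')                     # exercise 4
caps=$(echo "$input" | LC_ALL=C sed -r 's/([A-Z])/(\1)/g')     # exercise 5
odd_only=$(printf '1\n2\n3\n4\n' | sed '0~2d')                 # exercise 6
no_blank=$(printf 'a\n\nb\n' | sed '/^$/d')                    # exercise 7

printf '%s\n' "$no_second" "$swapped" "$no_digits" "$tabs" "$caps"
printf '%s\n---\n%s\n' "$odd_only" "$no_blank"
```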

Well, that is all you need to get started with the sed command. The only way to master it is to practice more, especially with regular expressions. Once you have, I believe it will be a great help in your future work.

Thank you. I’m Chopin. Stay tuned for more.
