
The pipe operator: |

On Linux we often use the pipe operator, “|”, which is simply a vertical bar. What it does seems obvious to anyone who uses Linux regularly:

Isn’t it just passing the result of the previous command to the next command?

Here’s an example:

cat system.log | grep hello

The command above prints to the screen only those lines of system.log that contain the string hello.

What does “the result of a command” mean?

The problem is that people are often unsure what “the result of a command” actually is.

Is it the text the command prints? Is it the command’s return code (which some of you may not know about)? Or is it something else?

In fact, “the result of a command” has no strict definition, so what we mean by it usually depends on the context. For an image-processing program, for example, it is natural to say that the result is an image.
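To make the two candidates concrete, here is a minimal Python sketch (Python is my choice for illustration; the article itself only uses shell commands). It assumes grep is installed, and the helper name run_grep is hypothetical. It runs grep once and hands back both the text the command wrote and its return code:

```python
import subprocess


def run_grep(pattern, text):
    """Feed `text` to grep and return (printed output, return code)."""
    proc = subprocess.run(
        ["grep", pattern],
        input=text,            # becomes what grep reads
        capture_output=True,   # collect what grep prints
        text=True,
    )
    return proc.stdout, proc.returncode


if __name__ == "__main__":
    print(run_grep("hello", "hello world\ngoodbye\n"))
    print(run_grep("hello", "goodbye\n"))
```

For grep, a return code of 0 means at least one line matched and 1 means none did, so the two “results” carry different information.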

How to describe what the cat command does

First, let’s look at what the command cat system.log does.

It prints the contents of system.log to the screen.

But “printing to the screen” is not precise enough, so let’s describe the process in more detail.

This is where the Linux concept of a file comes in.

Programs, open files, and file descriptors

On Linux, it is easy to accept that a program can open a file.

So how does a process (a running program is usually called a process) keep track of the files it has opened?

It uses file descriptors: a file descriptor is a number, together with some data associated with it.

When a process opens a file, it gets a new file descriptor. The number is normally the lowest one not currently in use, so in practice it tends to count upward: if descriptors 0 through 100 are in use and the process opens another file, the new descriptor is 101. A number becomes available for reuse once the process closes the file. I won’t go into more detail here.
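The numbering can be observed directly. The following is a small sketch, assuming a POSIX system where opening a file returns the lowest descriptor number not in use; the function name demo_descriptor_numbers is made up for this example:

```python
import os


def demo_descriptor_numbers():
    """Open two files, close the first, open a third, and return the three numbers."""
    a = os.open(os.devnull, os.O_RDONLY)
    b = os.open(os.devnull, os.O_RDONLY)
    os.close(a)                            # a's number is free again
    c = os.open(os.devnull, os.O_RDONLY)   # expected to reuse a's number
    os.close(b)
    os.close(c)
    return a, b, c


if __name__ == "__main__":
    print(demo_descriptor_numbers())
```

The exact numbers depend on what the process already has open, but the third descriptor should equal the first one, because that number was released by the close.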

If the concept were graphically represented, it would look something like this:

We can see that different numbers represent different files.

So where can I list the files that a process has open?

On Linux, first find the PID of the process; let’s say it is 20000.

Go to the following directory:

cd /proc/20000/fd

This directory holds one entry for each file that process 20000 has open. Run ls there to list them, as shown in the figure:

As I described, it’s all numbers.
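The same listing can be produced from a program. This is a sketch that assumes a Linux system with /proc mounted; list_open_files is a hypothetical helper, and "self" is the /proc alias for the calling process:

```python
import os


def list_open_files(pid="self"):
    """Map each descriptor number of a process to the file it refers to."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symbolic link named after the descriptor number.
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed after the listing was taken
            # (for example the one used to read the directory itself).
            continue
    return result


if __name__ == "__main__":
    for number, target in sorted(list_open_files().items()):
        print(number, "->", target)
```

Passing a PID such as 20000 instead of the default inspects another process, provided you have permission to read its /proc entry.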

Some conventional descriptors

From the figure above, we can see that the descriptor starts at 0. So what is 0?

On Linux, each process starts with three files open by default, using 0, 1, and 2 as their descriptors.

So what do the three files 0, 1, and 2 represent?

  • 0: standard input
  • 1: standard output
  • 2: standard error

That description may still be too abstract. In general:

  • Standard input is keyboard input
  • Standard output is output to the screen
  • Standard error is also output to the screen
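“In general” matters here: the three descriptors only point at the keyboard and the screen when the program runs in a terminal. The sketch below, which assumes Linux with /proc available and uses a made-up function name, reports what each of the three is actually connected to:

```python
import os

STANDARD_DESCRIPTORS = {
    0: "standard input",
    1: "standard output",
    2: "standard error",
}


def describe_standard_descriptors():
    """Return {descriptor: (name, target)} for descriptors 0, 1 and 2."""
    report = {}
    for fd, name in STANDARD_DESCRIPTORS.items():
        try:
            target = os.readlink(f"/proc/self/fd/{fd}")
        except OSError:
            target = None  # descriptor is closed or /proc is unavailable
        report[fd] = (name, target)
    return report


if __name__ == "__main__":
    for fd, (name, target) in describe_standard_descriptors().items():
        print(fd, name, "->", target)
```

Run from an interactive shell, the targets are typically a terminal device; run with redirection or inside a pipeline, they show a file or a pipe instead.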

The role of file descriptors

If you’ve ever written a program that reads keyboard input on Linux, this is how it works:

What you are reading is the contents of the file behind descriptor 0: the keyboard.

If you’ve ever used a function like print to write a log message, you were actually writing to the file behind descriptor 1, and whatever is written there is shown on the screen.

In fact, this read and write operation is the same for all files, at least in terms of usage.

You open a file, give it a descriptor, and then read or write to that descriptor.
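As an illustration of that uniformity, here is a short sketch using Python’s low-level os calls, which take descriptor numbers directly. The helper write_then_read is hypothetical, and the temporary file only exists to keep the example self-contained:

```python
import os
import tempfile


def write_then_read(data):
    """Write bytes through a descriptor, then read them back through it."""
    fd, path = tempfile.mkstemp()        # returns an already-open descriptor
    try:
        os.write(fd, data)               # write to the descriptor
        os.lseek(fd, 0, os.SEEK_SET)     # rewind to the start of the file
        return os.read(fd, len(data))    # read from the same descriptor
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(write_then_read(b"hello descriptor\n"))
    # The same call works on descriptor 1, which is standard output.
    os.write(1, b"written straight to descriptor 1\n")
```

Nothing in os.write distinguishes the temporary file from descriptor 1; only the number passed in changes.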

The relationship between files and keyboard input

On Linux, reading keyboard input is a file operation, just like reading an ordinary file. That is the Linux design philosophy: everything is a file.

It is not only keyboard input and screen output that are file operations. Reading data from a network connection, for example, is also a file operation, and it also produces a file descriptor.

There is a limit on how many file descriptors a process can have open at the same time, and this limit is configurable. If you don’t believe me, try opening 10,000 files at once and see whether you get an error and, if so, what it says.

It could be something like this:

Too many open files
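The experiment can be tried safely with a sketch like the one below. It is an illustration under assumptions: it runs on a Unix-like system, the function name open_until_limit is made up, and it temporarily lowers the process’s own soft limit so that the error appears after a few dozen files rather than thousands:

```python
import errno
import os
import resource


def open_until_limit(soft_limit=64):
    """Lower the soft descriptor limit, then open files until the kernel refuses."""
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if old_soft == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, old_soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, old_hard))
    opened = []
    try:
        while True:
            opened.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        # Report how many files were opened and which error stopped us.
        return len(opened), exc.errno, os.strerror(exc.errno)
    finally:
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    count, code, message = open_until_limit()
    print(count, errno.errorcode.get(code), message)
```

The count comes out a little below the limit because descriptors 0, 1, and 2 (and anything else the process already had open) count against it.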

Back to the pipe operator: |

cat system.log | grep hello

Let’s describe the whole command again:

1. cat system.log

The cat program opens system.log and writes its contents to descriptor 1, which is normally the screen.

2. grep hello

grep tries to read data from descriptor 0, finds the lines that contain hello, and writes those lines to descriptor 1, normally the screen.

3. The role of pipes

The pipe connects descriptor 1 of the previous program to descriptor 0 of the next program, so whatever cat writes becomes what grep reads.

Without it, cat’s output would go to the screen instead of to grep.
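The three steps can be reproduced without a shell. The sketch below assumes cat and grep are installed; cat_pipe_grep is a hypothetical helper that does by hand what the shell does for “|”, starting cat with its descriptor 1 attached to a pipe and grep with its descriptor 0 attached to the other end of that pipe:

```python
import os
import subprocess
import tempfile


def cat_pipe_grep(path, pattern):
    """Rough equivalent of: cat PATH | grep PATTERN"""
    # Step 1: cat writes to descriptor 1, which is a pipe instead of the screen.
    cat = subprocess.Popen(["cat", path], stdout=subprocess.PIPE)
    # Step 2: grep reads descriptor 0, which is the read end of that same pipe.
    grep = subprocess.Popen(
        ["grep", pattern],
        stdin=cat.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    cat.stdout.close()  # the parent no longer needs its copy of the read end
    output, _ = grep.communicate()
    cat.wait()
    return output


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as log:
        log.write("hello world\nnothing here\nsay hello\n")
    print(cat_pipe_grep(log.name, "hello"), end="")
    os.remove(log.name)
```

Here grep’s own descriptor 1 is captured so the function can return the matching lines; in the shell version it stays connected to the screen.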