Preface

To begin, I would like to recommend two articles. The first is bestsWifter's reading summary of “Self-cultivation of programmers”.

The second is an article I also think is very good, in which the author teaches us what attitude to take toward learning: “Why are Zhihu Live, Fenda and Dedao not real substance?” (20171123).

What I record below is also taken from the book “Self-cultivation of programmers”.

Getting to the point

C's classic “Hello, World” program is something almost any programmer can write with their eyes closed, then compile and run in one go.

#include <stdio.h>

int main()
{
    printf("Hello world\n");
    return 0;
}

On Linux, compiling the Hello World program with GCC takes only the simplest command (assuming the source file is called hello.c):

$ gcc hello.c
$ ./a.out
Hello world

In fact, this process can be broken down into 4 steps: Preprocessing, Compilation, Assembly and Linking.

  • GCC compilation process breakdown diagram

Preprocessing

The source file hello.c and the header files it uses, such as stdio.h, are preprocessed by the preprocessor cpp into a .i file. For C++ programs, the source file might have a .cpp or .cxx extension, the header might have a .hpp extension, and the preprocessed file might have a .ii extension. This first step is equivalent to the following command (-E means preprocess only):
$ gcc -E hello.c -o hello.i

or

$ cpp hello.c > hello.i

Preprocessing mainly deals with the directives in the source code that start with “#”, such as “#include” and “#define”. The main processing rules are as follows:

  • Remove all “#define” directives and expand all macros.

  • Handle all conditional directives, such as “#if”, “#ifdef”, “#elif”, “#else” and “#endif”.

  • Process each “#include” directive by inserting the included file at the location of the directive. Note that this process is recursive, because the included file may itself include other files.

  • Delete all comments.

  • Add line numbers and file name markers, such as # 2 "hello.c" 2, so that the compiler can generate line number information for debugging and show line numbers when it reports errors or warnings.

  • Keep all “#pragma” directives, because the compiler needs them.

The preprocessed .i file contains no macro definitions, because all macros have been expanded and the included files have been inserted into it. So when we cannot tell whether a macro is defined correctly or whether the right header file is included, we can inspect the preprocessed file to pin down the problem.

Compilation

Compilation runs lexical analysis, syntax analysis, semantic analysis and optimization over the preprocessed file to generate the corresponding assembly code file. It is equivalent to the following command:

$ gcc -S hello.i -o hello.s

When compilation is complete, a .s file is generated. For C code, the program that performs preprocessing and compilation is cc1; for C++ the corresponding program is cc1plus, and for Objective-C it is cc1obj. In fact gcc is just a wrapper around these background programs: depending on its arguments, it calls the preprocessor and compiler cc1, the assembler as, and the linker ld.

Assembly

The assembler converts assembly code into instructions the machine can execute; almost every assembly statement corresponds to one machine instruction. The assembler's job is therefore simple compared with the compiler's. It has no complex syntax or semantics to deal with and performs no instruction optimization; it just translates statements one by one according to a table that maps assembly instructions to machine instructions. Assembly can be done by calling the assembler as:

$ as hello.s -o hello.o

or

$ gcc -c hello.s -o hello.o

Or use the gcc command to go from the C source file through preprocessing, compilation and assembly, and output the object file directly:

$ gcc -c hello.c -o hello.o

Linking

Linking is often a rather confusing process. Why doesn't the assembler just output an executable instead of an object file? What exactly does the linking process involve? Why link at all? To produce a runnable Hello World program by calling ld directly, the command looks like this:

$ ld -static /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/gcc/i486-linux-gnu/4.1.3/crtbeginT.o -L/usr/lib/gcc/i486-linux-gnu/4.1.3 -L/usr/lib -L/lib hello.o --start-group -lgcc -lgcc_eh -lc --end-group /usr/lib/gcc/i486-linux-gnu/4.1.3/crtend.o /usr/lib/crtn.o

If we omit all of the paths above, the command becomes:

$ ld -static crt1.o crti.o crtbeginT.o hello.o --start-group -lgcc -lgcc_eh -lc --end-group crtend.o crtn.o

As you can see, we link a bunch of files together to get “a.out”, the final executable. What are the files crt1.o, crti.o, crtbeginT.o, crtend.o and crtn.o, and what do they do? What are the parameters -lgcc, -lgcc_eh and -lc? Why do they have to be linked with hello.o to get an executable? I don't know yet, haha. That is why we have to learn. The answers can be found in the book “Self-cultivation of programmers”.

The book “Self-cultivation of programmers” is well worth reading. I only skimmed parts of it before and have forgotten almost all of it, so I am taking this opportunity to read it again from the start and record the important points, to deepen my memory and understanding. As Confucius said, review the old and you will learn the new: one reading may leave only an impression, a second helps you remember, and a third may bring real understanding. Of course, that does not rule out the experts who get it all in one pass.