Interpreters and compilers

  • The interpreter: An interpreter is a program that directly executes instructions written in a programming language. Interpreted programs always need an interpreter in order to run.
  • The compiler: A compiler translates our source code into machine-readable binaries.

Example: Here is a piece of Python code

print("hello world\n")

As long as we have the Python environment installed, we can execute it directly, with no separate build step. That is what the interpreter does.

By contrast, here is a piece of C code:

#include<stdio.h>

int main(int argc,char * argv[]){
    printf("hello world\n");
    return 0;
}

To run it, we first need to compile it. Compiling produces a binary named a.out, and only then can the program be run. Both the clang compiler and the python interpreter can be found in the /usr/bin directory.

Bottom line: An interpreter reads the source and runs it directly, whereas a compiler must first translate all of our code into an executable binary.

LLVM

Summary: LLVM is a framework for building compilers, written in C++. It is designed to optimize programs written in any programming language at compile time, link time, run time, and idle time, while staying open to developers and compatible with existing scripts.

The architecture of a traditional compiler

  • Frontend (compiler front end): parses the source code. It performs lexical analysis and syntax analysis, checks the source code for syntax errors, and then builds an abstract syntax tree (AST). The LLVM front end also generates the intermediate representation (IR).
  • Optimizer: responsible for the various code optimizations, for example those that reduce the running time of the code.
  • Backend (code generator): maps the code to the target instruction set. It generates machine language and performs machine-specific code optimizations.
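
The three stages can be sketched in a few lines of Python. This is only a toy illustration, not how Clang or LLVM are implemented: the front end borrows Python's own ast module to parse an arithmetic expression, the optimizer folds constant subexpressions, and the back end emits instructions for a made-up stack machine. The function names and the instruction names (PUSH, LOAD, ADD, MUL) are assumptions chosen for this example.

```python
import ast

def front_end(source):
    """Front end: parse text into an AST (raises SyntaxError on bad input)."""
    return ast.parse(source, mode="eval").body

def optimize(node):
    """Optimizer: fold 'constant op constant' into a single constant."""
    if isinstance(node, ast.BinOp):
        left, right = optimize(node.left), optimize(node.right)
        if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
            if isinstance(node.op, ast.Add):
                return ast.Constant(left.value + right.value)
            if isinstance(node.op, ast.Mult):
                return ast.Constant(left.value * right.value)
        return ast.BinOp(left, node.op, right)
    return node

def back_end(node):
    """Back end: emit instructions for a hypothetical stack machine."""
    if isinstance(node, ast.Constant):
        return [f"PUSH {node.value}"]
    if isinstance(node, ast.Name):
        return [f"LOAD {node.id}"]
    if isinstance(node, ast.BinOp):
        names = {ast.Add: "ADD", ast.Mult: "MUL"}  # only + and * are supported
        if type(node.op) not in names:
            raise NotImplementedError(type(node.op).__name__)
        return back_end(node.left) + back_end(node.right) + [names[type(node.op)]]
    raise NotImplementedError(type(node).__name__)

def compile_expr(source):
    """Run the three stages in order."""
    return back_end(optimize(front_end(source)))

print(compile_expr("x + 2 * 3"))  # ['LOAD x', 'PUSH 6', 'ADD']
```

Because each stage only consumes the output of the previous one, any stage could be swapped out without touching the others.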

iOS compilation architecture

For Objective-C, C, and C++ the front-end compiler is Clang; for Swift the front-end compiler is the Swift compiler. In both cases the back end is LLVM.

The design of LLVM

  • LLVM's design pays off when a compiler has to support multiple source languages or multiple hardware architectures. Other compilers such as GCC have also been very successful, but because they are designed as one monolithic application, their uses are limited in many ways.
  • The most important aspect of LLVM's design is that it uses a common code representation (IR), so that a front end can be written for any programming language and a back end can be written for any hardware architecture.
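
A minimal sketch of why a shared IR helps, assuming a made-up IR (a list of tuples in postfix order) that has nothing to do with the real LLVM IR. Two toy front ends for different notations lower to the same IR, and two toy back ends for hypothetical targets consume it. All names here are invented for the illustration.

```python
def rpn_front_end(text):
    """Front end 1: postfix notation, e.g. '2 3 +'."""
    ops = {"+": ("add",), "*": ("mul",)}
    return [ops[t] if t in ops else ("const", int(t)) for t in text.split()]

def prefix_front_end(text):
    """Front end 2: prefix notation, e.g. '+ 2 3'."""
    tokens = text.split()

    def parse(pos):
        tok = tokens[pos]
        if tok in ("+", "*"):
            left, pos = parse(pos + 1)
            right, pos = parse(pos)
            return left + right + [("add",) if tok == "+" else ("mul",)], pos
        return [("const", int(tok))], pos + 1

    ir, _ = parse(0)
    return ir

def stack_back_end(ir):
    """Back end 1: text for a hypothetical stack machine."""
    return [f"push {i[1]}" if i[0] == "const" else i[0] for i in ir]

def register_back_end(ir):
    """Back end 2: text for a hypothetical register machine."""
    out, stack = [], []
    for n, instr in enumerate(ir):
        reg = f"r{n}"  # one fresh register per instruction
        if instr[0] == "const":
            out.append(f"{reg} = {instr[1]}")
        else:
            right, left = stack.pop(), stack.pop()
            out.append(f"{reg} = {instr[0]} {left}, {right}")
        stack.append(reg)
    return out

ir = prefix_front_end("+ 2 3")
print(ir == rpn_front_end("2 3 +"))  # True: both front ends produce the same IR
print(stack_back_end(ir))
print(register_back_end(ir))
```

With two front ends and two back ends, four language/target combinations are covered by four components instead of four complete compilers.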

The compilation process

  • Let's create a .m file (main.m here) and run the following command to list the phases of the compilation process:

clang -ccc-print-phases main.m

  • 0: Input file: locate the source file.
  • 1: Preprocessing stage: macros are replaced and header files are imported.
  • 2: Compilation stage: lexical analysis and syntax analysis check that the code is syntactically correct, and the stage ends by generating IR.
  • 3: Back end: LLVM optimizes the code pass by pass, each pass performing one transformation, and finally generates assembly code.
  • 4: Generate the object file.
  • 5: Link: link the required dynamic and static libraries to produce the executable file.
  • 6: Generate the executable file for the corresponding target architecture.

Preprocessing

#include <stdio.h>
#define kMachoC     2
int main(int argc, const char * argv[]) {
    int a= 1;
    int b = 3;
    printf("%d",a+b+kMachoC);
    return 0;
}

clang -E main.m

In the output you can see the contents of the imported header file and the macro replaced by its value.
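
Macro replacement is essentially text substitution. The following Python function is a deliberately simplified sketch of that idea, not the real preprocessor: it only handles object-like #define macros, ignores #include and conditional directives, and would wrongly substitute inside string literals. The name expand_macros is invented for this example.

```python
import re

def expand_macros(source):
    """Replace object-like #define macros in a C-like source string."""
    macros, output = {}, []
    for line in source.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.+)", line)
        if match:
            macros[match.group(1)] = match.group(2).strip()
            continue  # the directive itself does not appear in the output
        for name, value in macros.items():
            # \b restricts the match to whole identifiers
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m: value, line)
        output.append(line)
    return "\n".join(output)

code = '#define kMachoC     2\nprintf("%d",a+b+kMachoC);'
print(expand_macros(code))  # printf("%d",a+b+2);
```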

Compilation phase

Lexical analysis

clang -fmodules -fsyntax-only -Xclang -dump-tokens main.m

Once preprocessing is complete, lexical analysis begins: the code is split into individual tokens, such as parentheses, strings, and so on.
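
To make the idea of tokens concrete, here is a small regex-based tokenizer in Python. It is a sketch under simplifying assumptions, not Clang's lexer: the token categories are invented for the example, keywords are reported as plain identifiers, and comments and many operators are not handled.

```python
import re

# (category, pattern) pairs; the category names are assumptions of this sketch
TOKEN_SPEC = [
    ("string",     r'"(?:\\.|[^"\\])*"'),
    ("number",     r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("punct",      r"[()\[\]{};,=+\-*/<>]"),
    ("skip",       r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(code):
    """Split code into (category, text) pairs, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(code):
        match = TOKEN_RE.match(code, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {code[pos]!r} at offset {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for token in tokenize('printf("%d", a+b);'):
    print(token)
```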

Syntax analysis

clang -fmodules -fsyntax-only -Xclang -ast-dump main.m

Lexical analysis is followed by syntax analysis. Building on the token sequence, it groups the tokens into grammatical constructs such as “program”, “statement”, and “expression”, assembles all the resulting nodes into an abstract syntax tree (AST), and verifies that the syntax is correct.
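
As an illustration of how tokens become a tree, here is a small recursive-descent parser in Python. It is a sketch for a tiny invented grammar (integers, names, +, *, and parentheses), not the grammar Clang uses, and it represents the AST as nested tuples. Its one-line tokenizer silently ignores characters outside that grammar.

```python
import re

def parse(source):
    """Parse an expression into a nested-tuple AST, or raise SyntaxError."""
    tokens = re.findall(r"\d+|\w+|[()+*]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expression():  # expression := term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():  # term := factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():  # factor := NUMBER | NAME | '(' expression ')'
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if tok == "(":
            node = expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok) if tok.isdigit() else tok

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("a + b * 2"))  # ('+', 'a', ('*', 'b', 2))
```

The shape of the tree records operator precedence, and an input such as "(a + b" is rejected with a SyntaxError, which is the syntax check described above.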

Generate intermediate code

clang -S -fobjc-arc -emit-llvm main.m

Run the above command to generate the .ll file.

LLVM IR syntax

IR optimization

The LLVM optimization levels are -O0, -O1, -O2, -O3, and -Os.

clang -Os -S -fobjc-arc -emit-llvm main.m -o main.ll
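
Optimization works as a series of passes over the IR, and an optimization level selects which passes run. The Python sketch below illustrates that idea only: the tuple-based IR, the two passes, and the mapping from levels to passes are all invented for this example and do not reflect the pass pipelines LLVM actually uses.

```python
# Each instruction is (dest, op, lhs, rhs); every dest is assigned only once.

def simplify_identities(ir):
    """Pass 1: rewrite x + 0 and x * 1 as plain copies of x."""
    out = []
    for dest, op, lhs, rhs in ir:
        if (op == "add" and rhs == 0) or (op == "mul" and rhs == 1):
            out.append((dest, "copy", lhs, None))
        else:
            out.append((dest, op, lhs, rhs))
    return out

def eliminate_dead_code(ir):
    """Pass 2: drop instructions whose result is never read."""
    live, out = set(), []
    for dest, op, lhs, rhs in reversed(ir):  # walk from the return value backwards
        if op == "ret" or dest in live:
            live.update(x for x in (lhs, rhs) if isinstance(x, str))
            out.append((dest, op, lhs, rhs))
    return out[::-1]

# Hypothetical mapping from optimization level to pass list
PIPELINES = {
    "-O0": [],
    "-O1": [simplify_identities],
    "-O2": [simplify_identities, eliminate_dead_code],
}

def run_passes(ir, level):
    """Apply the passes of the chosen level one after another."""
    for optimization_pass in PIPELINES[level]:
        ir = optimization_pass(ir)
    return ir

program = [
    ("t0", "add", "a", 0),
    ("t1", "mul", "b", "c"),   # t1 is never used
    (None, "ret", "t0", None),
]
print(run_passes(program, "-O2"))
# [('t0', 'copy', 'a', None), (None, 'ret', 't0', None)]
```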

Bitcode

Since Xcode 7, when Bitcode is enabled, Apple performs further optimization and generates .bc intermediate code:

clang -emit-llvm -c main.ll -o main.bc

Generating assembly code

Generate assembly code from the final .bc or .ll file:

clang -S -fobjc-arc main.bc -o main.s

clang -S -fobjc-arc main.ll -o main.s

Optimization can also be applied when generating the assembly code:

clang -Os -S -fobjc-arc main.m -o main.s

Generate object file

clang -fmodules -c main.s -o main.o

View the symbols in the object file:

xcrun nm -nm main.o

external: indicates that the symbol is accessible from outside the file.

Generate the executable file (link)

clang main.o -o main

View the symbols in the linked file:

xcrun nm -nm main

_dyld_stub_binder is associated with symbols that need to be bound again at run time, for example the printf function.