Python has long been accused of being too slow, but the language itself isn’t really the problem. What’s slow is CPython, the reference interpreter, which executes bytecode rather inefficiently.

“One line of code makes Python run 100 times faster” is no empty claim.

So let’s look at the simplest case: summing the integers from 1 to 100 million.

The original code:

import time

def foo(x, y):
    tt = time.time()
    s = 0
    for i in range(x, y):
        s += i
    print('Time used: {} sec'.format(time.time() - tt))
    return s

print(foo(1, 100000000))

Results:

Time used: 6.779874801635742 sec
4999999950000000
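That printed sum can be sanity-checked against Gauss’s closed form: the sum of the integers from 1 to n-1 is n*(n-1)//2. A quick pure-Python check (with a smaller n so it runs instantly):

```python
# Sanity check: the loop's result matches Gauss's closed form n*(n-1)//2
def foo(x, y):
    s = 0
    for i in range(x, y):
        s += i
    return s

n = 100000
print(foo(1, n))         # 4999950000
print(n * (n - 1) // 2)  # 4999950000
```

With n = 100000000 the same identity gives 4999999950000000, matching the output above.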

Let’s add a line of code and see what happens:

from numba import jit
import time

@jit
def foo(x, y):
    tt = time.time()
    s = 0
    for i in range(x, y):
        s += i
    print('Time used: {} sec'.format(time.time() - tt))
    return s

print(foo(1, 100000000))

Results:

Time used: 0.04680037498474121 sec
4999999950000000

That’s more than 100 times faster.

Why is Numba’s JIT module so good?

After leaving Enthought, NumPy creator Travis Oliphant founded Continuum Analytics, which focuses on Python tools for big-data processing. Its Numba project can JIT-compile Python functions that operate on NumPy arrays into machine code, speeding programs up by a hundredfold.

Detailed Linux installation steps are available on the Numba project home page; note that compiling LLVM takes some time. On Windows, the “Unofficial Windows Binaries for Python Extension Packages” site provides installers for llvmpy, meta, and numba.

Here’s an example:

import numba as nb
from numba import jit

@jit('f8(f8[:])')
def sum1d(array):
    s = 0.0
    n = array.shape[0]
    for i in range(n):
        s += array[i]
    return s

import numpy as np
array = np.random.random(10000)
%timeit sum1d(array)
%timeit np.sum(array)
%timeit sum(array)
10000 loops, best of 3: 38.9 us per loop
10000 loops, best of 3: 32.3 us per loop
100 loops, best of 3: 12.4 ms per loop

Numba provides decorators that JIT-compile the functions they decorate into machine code and return a wrapper object that Python can call to run that machine code. To compile a Python function into fast machine code, we need to tell the JIT compiler the types of the function’s parameters and return value.

We can specify the type information in a number of ways. In the example above, it is given by the string ‘f8(f8[:])’, where ‘f8’ denotes an 8-byte double-precision floating-point number: the ‘f8’ before the parentheses is the return type, the types inside the parentheses are the parameter types, and ‘[:]’ denotes a one-dimensional array.

So the whole type string says that sum1d() takes a one-dimensional array of double-precision floats and returns a double-precision float. Note that a function JIT-compiled with an explicit signature only works correctly on parameters of the specified type:

print sum1d(np.ones(10, dtype=np.int32))
print sum1d(np.ones(10, dtype=np.float32))
print sum1d(np.ones(10, dtype=np.float64))
1.2095376009e-312
1.46201599944e+185
10.0

If you want the JIT to handle parameters of any type, use autojit:

from numba import autojit
@autojit
def sum1d2(array):
    s = 0.0
    n = array.shape[0]
    for i in range(n):
        s += array[i]
    return s

%timeit sum1d2(array)
print sum1d2(np.ones(10, dtype=np.int32))
print sum1d2(np.ones(10, dtype=np.float32))
print sum1d2(np.ones(10, dtype=np.float64))
10000 loops, best of 3: 143 us per loop
10.0
10.0
10.0

autojit can generate machine-code functions dynamically based on the argument types, but it is somewhat slower because it must check the argument types on every call. Numba is simple to use: basically the jit and autojit decorators plus a few type objects. The following program lists all the types Numba supports:

print [obj for obj in nb.__dict__.values() if isinstance(obj, nb.minivect.minitypes.Type)]
[size_t, Py_uintptr_t, uint16, complex128, float, complex256, void, int, long double,
unsigned PY_LONG_LONG, uint32, complex256, complex64, object_, npy_intp, const char *,
double, unsigned short, float, object_, float, uint64, uint32, uint8, complex128, uint16,
int, int, uint8, complex64, int8, uint64, double, long double, int32, double, long double,
char, long, unsigned char, PY_LONG_LONG, int64, int16, unsigned long, int8, int16, int32,
unsigned int, short, int64, Py_ssize_t]

Numba uses the meta module to parse a Python function into its AST and annotates each variable with type information. It then calls llvmpy to generate machine code, and finally wraps that machine code in a Python-callable interface.
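To get a feel for what such an AST looks like, the standard-library ast module (no meta required) can parse a function’s source into the same kind of tree that Numba annotates:

```python
import ast

# Parse a function's source into an AST -- the representation Numba annotates
src = "def add2(a, b):\n    return a + b\n"
tree = ast.parse(src)
func = tree.body[0]
print(type(func).__name__)     # FunctionDef
print(func.name)               # add2
print(ast.dump(func.body[0]))  # the Return(BinOp(...)) node
```

The difference is that meta recovers this tree from an already-compiled code object, while ast.parse starts from source text.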


The meta module

Looking at how Numba works turns up several useful tools. The meta module, for example, can convert between program source code, AST syntax trees, and Python bytecode. Here’s an example:

def add2(a, b):
    return a + b

The decompile_func function decompiles a function’s code object back into an AST, and str_ast renders that AST in a readable form; both tools are very helpful for learning about Python’s AST.

from meta.decompiler import decompile_func
from meta.asttools import str_ast
print str_ast(decompile_func(add2))
FunctionDef(args=arguments(args=[Name(ctx=Param(),
                                      id='a'),
                                 Name(ctx=Param(),
                                      id='b')],
                           defaults=[],
                           kwarg=None,
                           vararg=None),
            body=[Return(value=BinOp(left=Name(ctx=Load(),
                                               id='a'),
                                     op=Add(),
                                     right=Name(ctx=Load(),
                                                id='b')))],
            decorator_list=[],
            name='add2')

python_source converts an AST syntax tree back into Python source code:

from meta.asttools import python_source
python_source(decompile_func(add2))
def add2(a, b) :
    return (a + b)
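On Python 3.9+, the standard library can do this round trip by itself: ast.unparse plays the same role as meta’s python_source, turning an AST back into source text.

```python
import ast

# Round-trip: source -> AST -> source (ast.unparse requires Python 3.9+)
tree = ast.parse("def add2(a, b):\n    return a + b\n")
print(ast.unparse(tree))
```

The regenerated source is semantically equivalent, though formatting details (such as the parentheses meta adds around the return expression) may differ.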

decompile_pyc combines the two, decompiling a compiled .pyc or .pyo file back into source code. Let’s write a tmp.py file and compile it into tmp.pyc with py_compile:

with open("tmp.py", "w") as f:
    f.write("""
def square_sum(n):
    s = 0
    for i in range(n):
        s += i**2
    return s
""")
import py_compile
py_compile.compile("tmp.py")

The following call to decompile_pyc displays tmp.pyc as source code:

from meta.decompiler import decompile_pyc
with open("tmp.pyc", "rb") as f:
    decompile_pyc(f)
def square_sum(n) :
    s = 0
    for i in range(n):
        s += (i ** 2)
    return s
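Even without meta, the standard library can work with the compiled file directly: a .pyc can be loaded and executed via importlib, which confirms that the bytecode meta decompiles is a fully functional module. A stdlib-only sketch (the module name "tmpmod" is arbitrary):

```python
import importlib.util
import os
import py_compile
import tempfile

# Compile a module to .pyc, then load and run the compiled file directly
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "tmp.py")
    with open(path, "w") as f:
        f.write("def square_sum(n):\n"
                "    s = 0\n"
                "    for i in range(n):\n"
                "        s += i**2\n"
                "    return s\n")
    pyc = py_compile.compile(path)  # returns the path of the .pyc file
    # importlib picks a sourceless loader for the .pyc extension
    spec = importlib.util.spec_from_file_location("tmpmod", pyc)
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    print(mod.square_sum(10))  # 285
```

The stdlib dis module can likewise disassemble such code objects to bytecode, though unlike meta it cannot reconstruct source-level Python.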

The llvmpy module

LLVM is a compiler framework with JIT support, and llvmpy creates machine code dynamically by calling LLVM from Python. Creating machine code directly through llvmpy can be tedious; for example, the following program creates a function that computes the sum of two integers and then calls it:

from llvm.core import Module, Type, Builder
from llvm.ee import ExecutionEngine, GenericValue

# Create a new module with a function implementing this:
#
# int add(int a, int b) {
# return a + b;
#}
#
my_module = Module.new('my_module')
ty_int = Type.int()
ty_func = Type.function(ty_int, [ty_int, ty_int])
f_add = my_module.add_function(ty_func, "add")
f_add.args[0].name = "a"
f_add.args[1].name = "b"
bb = f_add.append_basic_block("entry")

# IRBuilder for our basic block
builder = Builder.new(bb)
tmp = builder.add(f_add.args[0], f_add.args[1], "tmp")
builder.ret(tmp)

# Create an execution engine object. This will create a JIT compiler
# on platforms that support it, or an interpreter otherwise
ee = ExecutionEngine.new(my_module)

# Each argument needs to be passed as a GenericValue object, which is a kind
# of variant
arg1 = GenericValue.int(ty_int, 100)
arg2 = GenericValue.int(ty_int, 42)

# Now let's compile and run!
retval = ee.run_function(f_add, [arg1, arg2])

# The return value is also GenericValue. Let's print it.
print "returned", retval.as_int()
returned 142

f_add is a dynamically generated machine-code function, which we can think of as a compiled C function. In the program above we call it through ee.run_function, but we can also get its address and call it via Python’s ctypes module.

Use get_pointer_to_function to obtain the address of the f_add function:

addr = ee.get_pointer_to_function(f_add)
addr
2975997968L

Then create a function type with ctypes.PYFUNCTYPE:

import ctypes
f_type = ctypes.PYFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)

Finally, convert the function’s address to a callable Python function via f_type and call it:

f = f_type(addr)
f(100, 42)
142
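This address-to-callable trick can be demonstrated without LLVM at all, since a ctypes callback is itself backed by a machine-code thunk. A small sketch using only the stdlib: we take a PYFUNCTYPE callback’s raw address and rebuild a callable from it, just as we did with f_add’s address.

```python
import ctypes

# A function type: int f(int, int), same shape as the LLVM "add" above
f_type = ctypes.PYFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)

@f_type
def add(a, b):       # the callback is backed by a real machine-code thunk
    return a + b

addr = ctypes.cast(add, ctypes.c_void_p).value  # raw function address
f = f_type(addr)     # wrap the raw address back into a Python callable
print(f(100, 42))    # 142
```

The `add` object must stay alive while `f` is used, since the thunk belongs to it; with llvmpy the execution engine plays that ownership role.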

All Numba does is parse a Python function’s AST and annotate it with type information. The typed AST is then converted into a machine-code function via llvmpy, and a wrapper for that machine-code function is created, using a technique much like ctypes, so that Python can call it.
