1. Using C++ in Jupyter Notebook

  • First, load the Cython extension with the magic command %load_ext Cython
  • Next, run the Cython code with the cell magic %%cython --cplus
  • On macOS, use the cell magic %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++ instead. For details, see https://stackoverflow.com/questions/57367764/cant-import-cpplist-into-cython
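The choice between the two cell-magic lines can be sketched in plain Python. The helper name `cython_magic_line` is hypothetical and only assembles the string to paste at the top of a cell; it is not part of Cython or IPython.

```python
import sys


def cython_magic_line(platform=sys.platform):
    """Return the %%cython cell-magic line for a platform (hypothetical helper)."""
    parts = ["%%cython", "--cplus"]
    if platform == "darwin":  # sys.platform is "darwin" on macOS
        parts += ["--compile-args=-stdlib=libc++", "--link-args=-stdlib=libc++"]
    return " ".join(parts)


print(cython_magic_line("linux"))
print(cython_magic_line("darwin"))
```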
%load_ext Cython

%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
# Note: use 'cimport' rather than 'import'
from libcpp.string cimport string
cdef string s
s = b"Hello world!"
print(s.decode("utf-8"))

Output:
Hello world!

2. C++ and Python type conversions

3. Using the C++ STL

3.1 Using C++ vector

A C++ vector can replace Python's list.

  1. Initialization – initializing from a Python iterable requires declaring the element type of the vector
  2. Traversal – increment an index inside a while loop
  3. Access – use the '[]' operator to access elements, as in Python
  4. Append – use the vector's push_back method, which is similar to the append method of a Python list

Finally, we compare performance by implementing an element-counting function in both Python and C++; the C++ version is about 240 times faster.

Note: To keep the comparison fair, the functions take no arguments and instead access variables defined outside the function body. This avoids counting the time the C++ version would need to convert a Python list into a C++ vector. If that time is included, the C++ version is only about four times faster.

%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
from libcpp.vector cimport vector

# Initialize from a Python object
cdef vector[int] vec = range(5)

# Traverse
cdef:
    int i = 0
    int n = vec.size()
print("Start traversing...")
while i < n:
    # Access
    print("\tThe element at position %d is %d" % (i, vec[i]))
    i += 1
print()

# Append
vec.push_back(5)
print("After appending an element, vec becomes", vec)

Output:
Start traversing...
        The element at position 0 is 0
        The element at position 1 is 1
        The element at position 2 is 2
        The element at position 3 is 3
        The element at position 4 is 4

After appending an element, vec becomes [0, 1, 2, 3, 4, 5]
arr = [x // 100 for x in range(1000)]
target = 6

def count_py():
    return sum(1 for x in arr if x == target)

print("Implemented in Python, the result is %d!" % count_py())

Output:
Implemented in Python, the result is 100!
%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
from libcpp.vector cimport vector

cdef:
    int target = 6
    vector[int] v = [x // 100 for x in range(1000)]

cdef int _count_cpp():
    cdef:
        int i = 0
        int n = v.size()
        int ret = 0
    while i < n:
        if v[i] == target:
            ret += 1
        i += 1
    return ret

def count_cpp():
    return _count_cpp()

print("Implemented in Cython (C++), the result is %d!" % count_cpp())

Output:
Implemented in Cython (C++), the result is 100!
print("Comparing the performance of the Python version with the C++ version...")
%timeit count_py()
%timeit count_cpp()

Output:
Comparing the performance of the Python version with the C++ version...
30.8 µs ± 254 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
130 ns ± 6.4 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

3.2 Using C++ unordered_map

A C++ unordered_map can replace Python's dict.

  1. Initialization – initializing from a Python iterable requires declaring the key and value types of the map
  2. Traversal – increment an iterator inside a while loop
  3. Access – dereference the iterator with deref (the '*' operator in C++), which returns a pair; use .first to access the key and .second to access the value
  4. Find – use unordered_map.count, which returns 1 or 0; or use unordered_map.find, which returns an iterator to the element, or unordered_map.end() if the key is not found
  5. Append/modify – unordered_map[key] = value. If the key does not exist, the '[]' operator inserts the key with a default value, such as 0.0. Therefore, unless you are sure the key exists, check for it before modifying its value. This behavior is somewhat similar to Python's defaultdict.

Finally, we compare performance by implementing a conditional summation over a map in both Python and C++; the C++ version is about 40 times faster.
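The defaultdict analogy can be seen directly in Python. This is a minimal sketch using `collections.defaultdict`; the variable names `values` and `checked` are only illustrative.

```python
from collections import defaultdict

values = defaultdict(float, {3: 0.3})

# Merely reading a missing key inserts it with the default value 0.0,
# much like the '[]' operator of unordered_map.
_ = values[-2]
print(dict(values))

# Checking membership first avoids the accidental insert.
checked = defaultdict(float, {3: 0.3})
if 3 in checked:
    checked[3] += 1.0
if -2 in checked:
    checked[-2] += 1.0
print(dict(checked))
```

The membership test plays the same role as calling unordered_map.count before using the '[]' operator.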

%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
from cython.operator cimport dereference as deref, preincrement as inc
from libcpp.unordered_map cimport unordered_map

# Initialize from a Python object
cdef unordered_map[int, float] mymap = {i: i / 10 for i in range(10)}

# Traverse
cdef:
    unordered_map[int, float].iterator it = mymap.begin()
    unordered_map[int, float].iterator end = mymap.end()
print("Start traversing...")
while it != end:
    # Access
    print("\tKey is %d, Value is %.1f" % (deref(it).first, deref(it).second))
    inc(it)
print()

# Find
print("Start searching...")
if mymap.count(-2):
    print("\tElement -2 exists!")
else:
    print("\tElement -2 does not exist!")

it = mymap.find(3)
if it != end:
    print("\tElement 3 exists and its value is %.1f!" % deref(it).second)
else:
    print("\tElement 3 does not exist!")
print()

# Modify
print("Modify elements...")
if mymap.count(3):
    mymap[3] += 1.0
mymap[-2]  # Key -2 does not exist, so it is added with the default value 0.0
print("\tKey is 3, Value is %.1f" % mymap[3])
print("\tKey is -2, Value is %.1f" % mymap[-2])

Output:
Start traversing...
        Key is 0, Value is 0.0
        Key is 1, Value is 0.1
        Key is 2, Value is 0.2
        Key is 3, Value is 0.3
        Key is 4, Value is 0.4
        Key is 5, Value is 0.5
        Key is 6, Value is 0.6
        Key is 7, Value is 0.7
        Key is 8, Value is 0.8
        Key is 9, Value is 0.9

Start searching...
        Element -2 does not exist!
        Element 3 exists and its value is 0.3!

Modify elements...
        Key is 3, Value is 1.3
        Key is -2, Value is 0.0
my_map = {x: x for x in range(100)}
target = 50

def sum_lt_py():
    return sum(my_map[x] for x in my_map if x < target)

print("Implemented in Python, the result is %d!" % sum_lt_py())

Output:
Implemented in Python, the result is 1225!
%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
from libcpp.unordered_map cimport unordered_map
from cython.operator cimport dereference as deref, preincrement as inc

cdef:
    unordered_map[int, int] my_map = {x: x for x in range(100)}
    int target = 50

cdef int _sum_lt_cpp():
    cdef:
        unordered_map[int, int].iterator it = my_map.begin()
        int ret = 0  # must be initialized before accumulating
    while it != my_map.end():
        if deref(it).first < target:
            ret += deref(it).second
        inc(it)
    return ret

def sum_lt_cpp():
    return _sum_lt_cpp()

print("Implemented in Cython (C++), the result is %d!" % sum_lt_cpp())

Output:
Implemented in Cython (C++), the result is 1225!
print("Comparing the performance of the Python version with the C++ version...")
%timeit sum_lt_py()
%timeit sum_lt_cpp()

Output:
Comparing the performance of the Python version with the C++ version...
6.63 µs ± 183 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
157 ns ± 3.13 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

3.3 Using C++ unordered_set

A C++ unordered_set can replace Python's set.

  1. Initialization – initializing from a Python iterable requires declaring the element type of the set
  2. Traversal – increment an iterator inside a while loop
  3. Access – dereference the iterator with deref (the '*' operator in C++)
  4. Find – use unordered_set.count, which returns 1 or 0
  5. Append – use unordered_set.insert; the element is not added if it already exists
  6. Intersection, union, difference – as far as I know, these operations must be implemented by the developer for unordered_set, so it is not as convenient as Python's set.

Finally, we compare performance by implementing set intersection in both Python and C++. Here the C++ version is about 20 times slower; for details, see https://stackoverflow.com/questions/54763112/how-to-improve-stdset-interp-performance-in-c. If we only count the elements the two sets have in common, the C++ version is about six times faster than Python. This suggests that C++'s unordered_set is fast to query but slow to build.
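As a rough illustration of what implementing these operations by hand involves, here is a pure-Python sketch that uses only membership tests and inserts, the two operations unordered_set provides (count and insert). The function names are illustrative and Python's built-in set operators are deliberately avoided.

```python
def intersection(a, b):
    """Elements present in both sets; iterate over the smaller one."""
    if len(a) > len(b):
        a, b = b, a
    result = set()
    for x in a:
        if x in b:         # plays the role of unordered_set.count(x)
            result.add(x)  # plays the role of unordered_set.insert(x)
    return result


def union(a, b):
    """Elements present in either set."""
    result = set()
    for x in a:
        result.add(x)
    for x in b:
        result.add(x)      # duplicates are ignored, as with insert
    return result


def difference(a, b):
    """Elements of a that are not in b."""
    result = set()
    for x in a:
        if x not in b:
            result.add(x)
    return result


s1, s2 = set(range(100)), set(range(50, 60))
print(sorted(intersection(s1, s2)))
```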

%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
from cython.operator cimport dereference as deref, preincrement as inc
from libcpp.unordered_set cimport unordered_set

# Initialize from a Python object
cdef unordered_set[int] myset = {i for i in range(5)}

# Traverse
cdef:
    unordered_set[int].iterator it = myset.begin()
    unordered_set[int].iterator end = myset.end()
print("Start traversing...")
while it != end:
    # Access
    print("\tValue is %d" % deref(it))
    inc(it)
print()

# Find
print("Start searching...")
if myset.count(-2):
    print("\tElement -2 exists!")
else:
    print("\tElement -2 does not exist!")
print()

# Append
print("Append elements...")
myset.insert(0)
myset.insert(-1)

print("\tMyset is: ", myset)

Output:
Start traversing...
        Value is 0
        Value is 1
        Value is 2
        Value is 3
        Value is 4

Start searching...
        Element -2 does not exist!

Append elements...
        Myset is:  {0, 1, 2, 3, 4, -1}
myset1 = {x for x in range(100)}
myset2 = {x for x in range(50, 60)}

def interp_py():
    return myset1 & myset2

print("Implemented in Python, the result is %s!" % interp_py())

Output:
Implemented in Python, the result is {50, 51, 52, 53, 54, 55, 56, 57, 58, 59}!
%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
from cython.operator cimport dereference as deref, preincrement as inc
from libcpp.unordered_set cimport unordered_set

cdef:
    unordered_set[int] myset1 = {x for x in range(100)}
    unordered_set[int] myset2 = {x for x in range(50, 60)}

cdef unordered_set[int] _interp_cpp():
    cdef:
        unordered_set[int].iterator it = myset1.begin()
        unordered_set[int] ret
    # Keep every element of myset1 that also appears in myset2
    while it != myset1.end():
        if myset2.count(deref(it)):
            ret.insert(deref(it))
        inc(it)
    return ret

def interp_cpp():
    return _interp_cpp()

print("Implemented in Cython (C++), the result is %s!" % interp_cpp())

Output:
Implemented in Cython (C++), the result is {50, 51, 52, 53, 54, 55, 56, 57, 58, 59}!
print("Comparing the performance of the Python version with the C++ version...")
%timeit interp_py()
%timeit interp_cpp()

Output:
Comparing the performance of the Python version with the C++ version...
244 ns ± 2.96 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
4.87 µs ± 100 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
myset1 = {x for x in range(100)}
myset2 = {x for x in range(50, 60)}

def count_common_py():
    return len(myset1 & myset2)

print("Implemented in Python, the result is %s!" % count_common_py())

Output:
Implemented in Python, the result is 10!
%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
from cython.operator cimport dereference as deref, preincrement as inc
from libcpp.unordered_set cimport unordered_set

cdef:
    unordered_set[int] myset2 = {x for x in range(100)}
    unordered_set[int] myset1 = {x for x in range(50, 60)}

cdef int _count_common_cpp():
    # Make sure myset1 is the smaller set, so the loop runs fewer times
    if myset1.size() > myset2.size():
        myset1.swap(myset2)
    cdef:
        unordered_set[int].iterator it = myset1.begin()
        int ret = 0
    while it != myset1.end():
        if myset2.count(deref(it)):
            ret += 1
        inc(it)
    return ret

def count_common_cpp():
    return _count_common_cpp()

print("Implemented in Cython (C++), the result is %s!" % count_common_cpp())

Output:
Implemented in Cython (C++), the result is 10!
print("Comparing the performance of the Python version with the C++ version...")
%timeit count_common_py()
%timeit count_common_cpp()

Output:
Comparing the performance of the Python version with the C++ version...
276 ns ± 3.18 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
46.2 ns ± 0.845 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
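The "times faster" figures quoted in this section follow from dividing the mean %timeit results reported above. A small sketch of that arithmetic, where the helper name `speedup` and the dictionary labels are only illustrative:

```python
def speedup(python_seconds, cpp_seconds):
    """How many times faster the C++ version is (values below 1 mean slower)."""
    return python_seconds / cpp_seconds


US, NS = 1e-6, 1e-9  # microseconds and nanoseconds expressed in seconds

# (Python mean, C++ mean) taken from the %timeit outputs in this section
timings = {
    "vector: count elements": (30.8 * US, 130 * NS),
    "unordered_map: conditional sum": (6.63 * US, 157 * NS),
    "unordered_set: intersection": (244 * NS, 4.87 * US),
    "unordered_set: count common": (276 * NS, 46.2 * NS),
}

for name, (py_t, cpp_t) in timings.items():
    ratio = speedup(py_t, cpp_t)
    if ratio >= 1:
        print("%s: C++ is about %.0f times faster" % (name, ratio))
    else:
        print("%s: C++ is about %.0f times slower" % (name, 1 / ratio))
```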

4. Passing by value and by reference

Python always passes object references to functions. A mutable container (e.g. list, set) can therefore be modified by the function, while an immutable object (e.g. int, float) behaves as if it were passed by value. If you don't want the function to modify a container, you can use the deepcopy function from the copy module to make a copy of it first. In C++, the default is to pass by value; to pass by reference, you must declare it explicitly.
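A small sketch of what this means in practice on the Python side: in-place changes to a mutable container are visible to the caller, whereas rebinding the parameter name or working with an immutable int leaves the caller's object untouched. The function names are illustrative.

```python
def mutate(v):
    v[0] = -1      # changes the caller's list in place


def rebind(v):
    v = [9, 9, 9]  # only rebinds the local name; the caller's list is untouched


def increment(n):
    n += 1         # ints are immutable, so this binds a new object to the local name
    return n


items = [0, 0, 0]
rebind(items)
after_rebind = list(items)
mutate(items)
after_mutate = list(items)

number = 1
result = increment(number)

print(after_rebind, after_mutate, number, result)
```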

Taking a vector of int as an example, we can see that the value of v1 is left unchanged by pass_value but is changed by pass_reference.

  • Pass by value uses vector[int]: the pass_value function receives only a copy of v1, so it cannot modify v1.
  • Pass by reference uses vector[int]&: the pass_reference function receives a reference to v1, so it can modify v1.

The following two pieces of code show how Python differs from C++.

from copy import deepcopy

def pass_value(v):
    v = deepcopy(v)
    v[0] = -1

def pass_reference(v):
    v[0] = -1

v1 = [0, 0, 0]
print("The initial value of v1 is %s" % v1)
pass_value(v1)
print("After pass_value is executed, the value of v1 is %s" % v1)
pass_reference(v1)
print("After pass_reference is executed, the value of v1 is %s" % v1)

Output:
The initial value of v1 is [0, 0, 0]
After pass_value is executed, the value of v1 is [0, 0, 0]
After pass_reference is executed, the value of v1 is [-1, 0, 0]
%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++

from libcpp.vector cimport vector

cdef void pass_value(vector[int] v):
    v[0] = -1

cdef void pass_reference(vector[int]& v):
    v[0] = -1

cdef vector[int] v1 = [0, 0, 0]
print("The initial value of v1 is %s" % v1)
pass_value(v1)
print("After pass_value is executed, the value of v1 is %s" % v1)
pass_reference(v1)
print("After pass_reference is executed, the value of v1 is %s" % v1)

Output:
The initial value of v1 is [0, 0, 0]
After pass_value is executed, the value of v1 is [0, 0, 0]
After pass_reference is executed, the value of v1 is [-1, 0, 0]

5. Numeric ranges

Python has a single integer type, int, whose range is effectively unlimited as long as memory allows, so Python users rarely worry about overflow. In C++, however, you need to be careful, because each numeric type has a fixed range.
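To get a feel for these limits without leaving Python, the sketch below uses the standard ctypes module to compute the range of two C integer types on the current platform. The helper name `signed_range` is illustrative, and the code assumes two's-complement integers. ctypes performs no overflow checking, so an out-of-range value wraps around.

```python
import ctypes


def signed_range(ctype):
    """(min, max) of a signed C integer type, assuming two's complement."""
    bits = 8 * ctypes.sizeof(ctype)
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, ctype in [("int", ctypes.c_int), ("long long", ctypes.c_longlong)]:
    low, high = signed_range(ctype)
    print("%s: %d to %d" % (name, low, high))

# One past the maximum wraps around to the minimum
int_min, int_max = signed_range(ctypes.c_int)
wrapped = ctypes.c_int(int_max + 1).value
print(int_max + 1, "stored in a C int becomes", wrapped)
```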

The following functions, for example, produce wrong results.

%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
def sum_py(num1, num2):
    print("The result by python is:", num1 + num2)

cdef int _sum_cpp(int num1, int num2):  # fix: change int to long int
    return num1 + num2

def sum_cpp(num1, num2):
    print("The result by cpp is:", _sum_cpp(num1, num2))

sum_py(2**31 - 1, 1)
sum_cpp(2**31 - 1, 1)

Output:
The result by python is: 2147483648
The result by cpp is: -2147483648
%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
from libcpp cimport bool

def lt_py(num1, num2):
    print("The result by python is:", num1 < num2)

cdef bool _lt_cpp(float num1, float num2):  # fix: change float to double
    return num1 < num2

def lt_cpp(num1, num2):
    print("The result by cpp is:", _lt_cpp(num1, num2))

lt_py(1234567890.0, 1234567891.0)
lt_cpp(1234567890.0, 1234567891.0)

Output:
The result by python is: True
The result by cpp is: False
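The reason for the differing results can be reproduced in pure Python: a C++ float is a 32-bit value, and both arguments round to the same 32-bit number. This is a sketch using the standard struct module to round-trip through 32 bits; the helper name `to_float32` is illustrative.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


a, b = 1234567890.0, 1234567891.0

print(a < b)                           # comparison at 64-bit precision
print(to_float32(a), to_float32(b))    # both values after rounding to 32 bits
print(to_float32(a) < to_float32(b))   # comparison at 32-bit precision
```

At this magnitude, adjacent 32-bit floats are 128 apart, so two numbers that differ by 1 cannot be told apart.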

Author: Li Xiaowen has worked in data analysis and data mining, develops mainly in Python, and now works as an algorithm engineer at a small Internet company.

Github: github.com/tushushu
