The Python documentation describes the itertools module as "functions creating iterators for efficient looping." Some of its tools do improve performance; others are no faster and only save development time, and any of them can make code less readable if overused. Let's put several itertools functions to the test.

1. Sequence accumulation

Given a sequence a_n as a list, return its running sums S_n. For example:

Input: [1, 2, 3, 4, 5]

Return: [1, 3, 6, 10, 15]

With accumulate, performance improved about 2.5 times.

from itertools import accumulate

def _accumulate_list(arr):
    tot = 0
    for x in arr:
        tot += x
        yield tot

def accumulate_list(arr):
    return list(_accumulate_list(arr))

def fast_accumulate_list(arr):
    return list(accumulate(arr))

arr = list(range(1000))

%timeit accumulate_list(arr)
61 µs ± 2.91 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit fast_accumulate_list(arr)
21.3 µs ± 811 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
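accumulate is not limited to addition: it also accepts an optional two-argument function. As a minimal sketch (the helper name running_max and the sample data are illustrative and not part of the benchmark above), the same call can produce a running maximum:

```python
from itertools import accumulate

def running_max(arr):
    # accumulate applies the two-argument function cumulatively, left to right
    return list(accumulate(arr, max))

print(running_max([3, 1, 4, 1, 5]))  # [3, 3, 4, 4, 5]
```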

2. Select data

Given a list data and a list selectors of 0/1 flags, return the elements of data whose selector is 1. For example:

Input: [1, 2, 3, 4, 5], [0, 1, 0, 1, 0]

Return: [2, 4]

With compress, performance improved 2.8 times.

from itertools import compress
from random import randint

def select_data(data, selectors):
    return [x for x, y in zip(data, selectors) if y]

def fast_select_data(data, selectors):
    return list(compress(data, selectors))

data = list(range(10000))
selectors = [randint(0, 1) for _ in range(10000)]

%timeit select_data(data, selectors)
341 µs ± 17.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit fast_select_data(data, selectors)
130 µs ± 3.19 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
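Two details of compress are easy to miss: any truthy selector counts as "keep", so the flags need not be literal 0/1 values, and iteration stops as soon as the shorter of the two inputs runs out. A small sketch with made-up sample values:

```python
from itertools import compress

letters = ["a", "b", "c", "d", "e"]

# Any truthy value selects the element; falsy values (0, None) drop it.
mixed = list(compress(letters, [True, 0, "x", None, 1]))
print(mixed)  # ['a', 'c', 'e']

# Iteration stops when the shorter input is exhausted.
short = list(compress(letters, [1, 1]))
print(short)  # ['a', 'b']
```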

3. Permutations

Given a list arr and a number k, return every ordered selection (permutation) of k elements from arr. For example:

Input: [1, 2, 3], 2

Returns: [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]

With permutations, performance improved tenfold.

from itertools import permutations

def _get_permutations(arr, k, i):
    if i == k:
        return [arr[:k]]
    res = []
    for j in range(i, len(arr)):
        arr_cpy = arr.copy()
        arr_cpy[i], arr_cpy[j] = arr_cpy[j], arr_cpy[i]
        res += _get_permutations(arr_cpy, k, i + 1)
    return res

def get_permutations(arr, k):
    return _get_permutations(arr, k, 0)

def fast_get_permutations(arr, k):
    return list(permutations(arr, k))

arr = list(range(10))
k = 5

%timeit -n 1 get_permutations(arr, k)
15.5 ms ± 1.96 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit -n 1 fast_get_permutations(arr, k)
1.56 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
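Because the expected output treats (1, 2) and (2, 1) as different results, this task calls for permutations rather than combinations. The sketch below contrasts the two on the example input and uses math.perm (available from Python 3.8) to count how many tuples the benchmark above generates:

```python
from itertools import combinations, permutations
from math import perm  # Python 3.8+

arr = [1, 2, 3]

# Order matters: (1, 2) and (2, 1) are distinct results.
print(list(permutations(arr, 2)))
# [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]

# Order ignored: each pair appears once.
print(list(combinations(arr, 2)))
# [(1, 2), (1, 3), (2, 3)]

# Size of the benchmark result for 10 elements and k = 5: 10 * 9 * 8 * 7 * 6
print(perm(10, 5))  # 30240
```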

4. Filter the data

Given a list arr, keep only the even numbers. For example:

Input: [3, 1, 4, 5, 9, 2]

Return: [4, 2]

With filterfalse, performance gets worse, so don’t trust itertools blindly.

from itertools import filterfalse

def get_even_nums(arr):
    return [x for x in arr if x % 2 == 0]

def fast_get_even_nums(arr):
    return list(filterfalse(lambda x: x % 2, arr))

arr = list(range(10000))

%timeit get_even_nums(arr)
417 µs ± 18.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit fast_get_even_nums(arr)
823 µs ± 22.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
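The %timeit magic only works inside IPython or Jupyter. To repeat this comparison from a plain script, the standard timeit module can be used instead. The sketch below is one assumed way to do that, not the author's setup. The third variant, which passes the bound method (1).__and__ in place of the lambda, is an extra experiment added here and only makes sense for integers. No timings are quoted because they depend on the machine and the Python version.

```python
from itertools import filterfalse
from timeit import timeit

arr = list(range(10000))

def with_comprehension(arr):
    return [x for x in arr if x % 2 == 0]

def with_lambda(arr):
    return list(filterfalse(lambda x: x % 2, arr))

def with_bound_method(arr):
    # (1).__and__(x) computes 1 & x: 1 for odd ints, 0 for even ints
    return list(filterfalse((1).__and__, arr))

# All three variants must agree before their speed is compared.
assert with_comprehension(arr) == with_lambda(arr) == with_bound_method(arr)

runs = 100
for fn in (with_comprehension, with_lambda, with_bound_method):
    seconds = timeit(lambda: fn(arr), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e6:.1f} µs per call")
```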

5. Conditional termination

Given a list arr, add up its numbers in order. As soon as an element is greater than target, stop and return the sum so far. For example:

Input: [1, 2, 3, 4, 5], 3

Return: 6 (4 > 3, terminate)

With takewhile, performance gets worse, so don’t trust itertools blindly.

from itertools import takewhile

def cond_sum(arr, target):
    res = 0
    for x in arr:
        if x > target:
            break
        res += x
    return res

def fast_cond_sum(arr, target):
    return sum(takewhile(lambda x: x <= target, arr))

arr = list(range(10000))
target = 5000

%timeit cond_sum(arr, target)
245 µs ± 11.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit fast_cond_sum(arr, target)
404 µs ± 13.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
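Speed aside, takewhile has a behavioral difference worth knowing when the input is an iterator rather than a list: it must read the first failing element to learn that the condition failed, and that element is then lost. A short sketch with illustrative values:

```python
from itertools import takewhile

it = iter([1, 2, 3, 4, 5])
head = list(takewhile(lambda x: x <= 3, it))
print(head)  # [1, 2, 3]

# takewhile had to read 4 to discover that the condition failed,
# so 4 is gone from the iterator and only 5 remains.
rest = list(it)
print(rest)  # [5]
```

With a list as input, as in the benchmark above, this does not matter, because the list itself is never consumed.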

6. Loop nesting

Given two lists arr1 and arr2, return the sum of every pair of elements, one taken from each list. For example:

Input: [1, 2], [4, 5]

Return: [1 + 4, 1 + 5, 2 + 4, 2 + 5]

With product, performance improved 1.25 times.

from itertools import product

def _cross_sum(arr1, arr2):
    for x in arr1:
        for y in arr2:
            yield x + y

def cross_sum(arr1, arr2):
    return list(_cross_sum(arr1, arr2))

def fast_cross_sum(arr1, arr2):
    return [x + y for x, y in product(arr1, arr2)]

arr1 = list(range(100))
arr2 = list(range(100))

%timeit cross_sum(arr1, arr2)
484 µs ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit fast_cross_sum(arr1, arr2)
373 µs ± 11.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
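product yields its pairs in the same order as the nested for loops, which is why the two functions above return identical lists. The sketch below checks this on the example input and also shows a comprehension with two for clauses, a third spelling that was not part of the measurements above:

```python
from itertools import product

arr1, arr2 = [1, 2], [4, 5]

# product yields pairs in the same order as the nested loops
pairs = list(product(arr1, arr2))
print(pairs)  # [(1, 4), (1, 5), (2, 4), (2, 5)]

# A comprehension with two for clauses is another way to write the same loop
sums = [x + y for x in arr1 for y in arr2]
print(sums)  # [5, 6, 6, 7]
```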

7. Flattening a 2-D list into a 1-D list

Given a two-dimensional list arr2d, convert it into a one-dimensional list. For example:

Input: [[1, 2], [3, 4]]

Return: [1, 2, 3, 4]

With chain, performance improved about 6 times.

from itertools import chain

def _flatten(arr2d):
    for arr in arr2d:
        for x in arr:
            yield x

def flatten(arr2d):
    return list(_flatten(arr2d))

def fast_flatten(arr2d):
    return list(chain(*arr2d))

arr2d = [[x + y * 100 for x in range(100)] for y in range(100)]

%timeit flatten(arr2d)
379 µs ± 15.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit fast_flatten(arr2d)
66.9 µs ± 3.43 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
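chain(*arr2d) unpacks every inner list into a separate call argument. itertools also provides chain.from_iterable, which takes the outer iterable directly. The sketch below (the function name flatten_from_iterable is illustrative, and it was not timed in the benchmark above) shows the alternative spelling:

```python
from itertools import chain

def flatten_from_iterable(arr2d):
    # chain.from_iterable reads the inner lists one at a time
    # instead of unpacking them all into call arguments
    return list(chain.from_iterable(arr2d))

print(flatten_from_iterable([[1, 2], [3, 4]]))  # [1, 2, 3, 4]

# It also accepts a generator of rows, which chain(*rows) would have to
# consume completely before producing anything.
rows = ([i, i + 1] for i in range(0, 6, 2))
print(flatten_from_iterable(rows))  # [0, 1, 2, 3, 4, 5]
```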

Author: Li Xiaowen has worked in data analysis and then data mining, develops mainly in Python, and now works as an algorithm engineer at a small Internet company.

