This article originated from personal public account: TechFlow, original is not easy, for attention


In this article we will talk about sorting in Python. Like many high-level languages, Python encapsulates sophisticated sorting functions. All we need to do is call the internal sort function. However, in actual scenarios, sorting is often complicated. For example, there are multiple fields in the object type, and we want to sort by the specified field, or we want to sort by multiple keywords. At this time, it cannot be solved by simple function call.


Dictionary order


Let’s start by looking at the most common dictionary sort scenario. Suppose we have an array of dictionaries with multiple fields in the dictionary. We want to be able to sort by a field in the dictionary. Let’s use actual data as an example:

kids = [
    {'name': 'xiaoming'.'score': 99.'age': 12},
    {'name': 'xiaohong'.'score': 75.'age': 13},
    {'name': 'xiaowang'.'score': 88.'age': 15}]Copy the code

Kids is a dict array with name, Score, and age fields. What if we wanted to be able to sort by score?

There are several solutions to this problem. First, we can specify sorting by using the anonymous function mentioned in the previous article. The use of priority queues is the same as in the previous article, so let’s go straight to the code:

sorted(kids, key=lambda x: x['score'])
Copy the code

In the anonymous function, the x we receive is an element from Kids, which is a dict, so if we want to specify the fields we want, we need to use the dict method to access the elements, that is, use brackets to find the value of the corresponding field.

What if we want to sort by multiple keywords?

First of all, we will introduce multi-keyword sorting, or use the above data for example. In the example above, each kid’s score is different, so the ranking result is certain. But if there are two people with equal score, and I want the younger one to rank first, what should I do? If we analyze it, we can find that the original order is from the smallest to the largest, but there may be equal scores. At this point, we want to be able to compare the ages according to the score is equal, that is, we want to sort by two keywords, the first is the score, the second is the age.

Since Python supports tuple and list sorting, which means we can directly compare the size of [1, 3] and [1, 2], Python automatically compares the size of the elements in both arrays at once. If they are equal, they are automatically compared backward until they are unequal or finished.

Once you understand that, it’s actually pretty easy. We just need to modify the anonymous function to add a field to the result it returns.

sorted(kids, key=lambda x: (x['score'], x['age']))
Copy the code


itemgetter


In addition to anonymous functions, Python also has built-in libraries to address this problem. The usage is very similar to anonymous functions and slightly easier to use.

This is the Itemgetter function in the operator library. Let’s go straight to the code:

from operator import itemgetter

sorted(kids, key=itemgetter('score'))
Copy the code

If you have multiple keys, you can also pass multiple keys:

sorted(kids, key=itemgetter('score'.'age'))
Copy the code


Object sorting


Let’s look at the custom ordering of objects. We’ll first write the dict above as an object:

class Kid:
    def __init__(self, name, score, age):
        self.name = name
        self.score = score
        self.age = age

    def __repr__(self):
        return 'Kid, name: {}, score: {}, age:{}'.format(self.name, self.score, self.age)
Copy the code

To make it easier to see the print, we override the __repr__ method, which we can simply treat as Java’s toString method, so that we can specify the output when we print it.

Similarly, operator provides a sorting factor function for the object, which is used the same way as itemgetter but with a different name.

from operator import attrgetter

kids = [Kid('xiaoming'.99.12), Kid('xiaohong'.75.13), Kid('xiaowang'.88.15)]

sorted(kids, key=attrgetter('score'))
Copy the code

We can also do this using the anonymous function lambda:

sorted(kids, key=lambda x: x.score)
Copy the code


Custom sort


This is not the end, because there are still some problems that cannot be solved. Although we have achieved multi-keyword sort, there is still a problem that can not be solved, that is, the order of the sort problem.

We can pass reverse=True in the arguments to the sorted function to control whether it is sorted forward or backward, but what if I use multiple keywords and want to order one keyword in ascending order and one keyword in descending order? For example, if we want to follow the descending order of scores, the ascending order of ages cannot be solved by reverse, which is the problem that cannot be solved at present.

So what should we do?

At this time you need the ultimate sort killing device to play, which is the title of the custom sort. That is, we implement a function that defines the size of an element, and then get sorted to call our function to do the sorting. This is also used in languages like C++ and Java.

Custom functions are not hard to write, so we can just write them:

def cmp(kid1, kid2):
    return kid1.age < kid2.age if kid1.score == kid2.score else kid1.score > kid2.score
Copy the code

If you don’t understand it, it doesn’t matter, I’ll write it in full:

def cmp(kid1, kid2):
    if kid1.score == kid2.score:
        return kid1.age < kid2.age
    else:
        return kid1.score > kid2.score
Copy the code

This function cannot be used directly after it is written, and it is not the same as lambda anonymous functions we mentioned earlier. The previous anonymous function was only used to specify fields, so we can’t pass this function directly to the key, we need to process it on the outside. Functools is a funcTools function that we can call directly. Let’s look at the code:

from functools import cmp_to_key

sorted(kids, key=cmp_to_key(cmp))
Copy the code

Cmp_to_key = cmp_to_key;

def cmp_to_key(mycmp):
    """Convert a cmp= function into a key= function"""
    class K(object):
        __slots__ = ['obj']
        def __init__(self, obj):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        __hash__ = None
    return K
Copy the code

We can see that inside the function, it actually defines a class, overloads the comparison function in the class, and returns a new object that overloads the comparison function. These __lt__, __gt__ functions are overloaded comparison functions in a class. For example, __lt__ is a less than function and __eq__ is an equal function. So the question is, can we just override the comparison function in Kid so that we can sort directly?

The answer is yes, of course we can, and in fact it’s a very common practice in object orientation. We often prefer to define priorities in our classes rather than custom comparison functions. In Python, we manually implement an __lt__ function. Sorted by default ranks the smaller elements first, so we only implement __lt__ as a function. The argument passed to this function is another object, so we can just write the comparison logic in the function. Returning True indicates that the current object is smaller than other, or larger than other otherwise.

We have attached the complete code:

class Kid:
    def __init__(self, name, score, age):
        self.name = name
        self.score = score
        self.age = age

    def __repr__(self):
        return 'Kid, name: {}, score: {}, age:{}'.format(self.name, self.score, self.age)

    def __lt__(self, other):
        return self.score > other.score or (self.score == other.score and self.age < other.age)
Copy the code

After implementing the comparison function, we call the sorted function directly and sort it without using any other passed arguments.

Although today’s content is not difficult, but in our daily programming is very commonly used, often there will be a need to sort complex objects and content, so I hope you all master, because it will be useful.

That’s all for today’s article. If you feel you have gained something, please scan the code and pay attention to it. Your contribution is very important to me.