16th Nov 2021 9 minutes read How to Write Custom Sort Functions in Python Xavier Rigoulet python python basics sort In computer science, a sorting algorithm puts elements of a list into a particular order. They are important because they often reduce the complexity of a problem. Let’s discover how to use custom sort functions to implement custom orders and comparisons in Python. In my previous article on working with streams in Python, I briefly introduced sorting methods with list.sort() and sorted(). Both list.sort() and sorted() have a key parameter that specifies a function to be called on each list element before making comparisons. In this article, I want to go further on the sort topic and explore how to write a custom sort function in Python. In other words, I’ll explain how to use a custom lambda function as a key parameter. If you are not comfortable with Python functions, it is a good idea to read How to Define a Function in Python before diving deeper into this article. Sorting With Custom Sort Function in Python First, let’s talk about the difference between sort() and sorted(). In terms of syntax, sort() is an instance method implemented as list_to_sort.sort(), while sorted() is used as sorted(list_to_sort). One important thing to note is that sort() modifies the initial variable directly, and consequently, the initial order will be lost. On the other hand, sorted() keeps a copy of the initial variable, making it possible to revert to the initial order if needed. Because sort() does not make any copy of the initial variable, it is a bit more efficient than sorted(). However, this comes at the cost of convenience. It is also important to note that sorted() will return a list; therefore, you have to assign the output to a new variable. As for list.sort(), it modifies the list in place and has no return value. Last but not least, list.sort() can only work on lists while sorted() accepts any iterable. For example, here’s a case-insensitive string comparison: >>> sorted("LearnPython.com is awesome to learn about custom sort functions in Python".split(), key=str.lower) ['about', 'awesome', 'custom', 'functions', 'in', 'is' 'Learn', 'LearnPython.com', 'Python', 'sort', 'to'] Note: It is common to pass a custom lambda function as a key parameter to sort complex objects in Python. Now, let’s talk about custom sort functions in Python. In Python, we can write custom sort functions that work with sort() and sorted(). The value of the key parameter should be a function that takes a single argument and returns a key for sorting purposes. Because the key function is called only once for each input record, this is an efficient way to perform sorting in Python. A common pattern is to sort complex objects using some of the object’s indices as key. For example, we can define a custom order to sort a list of tuples: >>> pokemon = [ ... ('Charmander', 'Fire', 52), ... ('Blastoise', 'Water', 83), ... ('Beedrill', 'Poison', 90), ... ] >>> sorted(pokemon, key=lambda x: x[2]) # sort by attack power [('Charmander', 'Fire', 52), ('Blastoise', 'Water', 83), ('Beedrill', 'Poison', 90)] It also works for objects with name attributes: >>> class Pokemon: ... def __init__(self, name, category, attack): ... self.name = name ... self.category = category ... self.attack = attack ... def __repr__(self): ... return repr((self.name, self.category, self.attack)) >>> pokemon_objects = [ ... Pokemon('Beedrill', 'Poison', 90), ... Pokemon('Charmander', 'Fire', 52), ... Pokemon('Blastoise', 'Water', 83), ... ] >>> sorted(pokemon_objects, key=lambda x: x.attack) # sort by attack [('Charmander', 'Fire', 52), ('Blastoise', 'Water', 83), ('Beedrill', 'Poison', 90)] You can learn more about custom objects in Python in the article Simple Steps for Creating Your Own Class in Python. Knowing how to manipulate data, write custom sort functions in Python, and perform custom comparisons are essential skills to master. Our Introduction to Python for Data Science is an excellent way to pick up this in-demand skillset. Custom Comparison with Sort Function in Python You can also use sorted() with a custom comparator as its parameter. In Python 2, sorted() can be implemented with a custom comparator, either cmp or the key parameter. It is important to note that cmp needs to pass two parameters (x and y) that are parts of the list. It will return a number with the following logic: If it returns a positive number: x > y If it returns 0: x == y If it returns a negative number: x < y However, key receives a parameter, computes the result, and then makes use of the computation to sort and compare. This means that in Python 2, you could sort a list of numbers by their cube value in two different ways: >>> l = [6, 8, 10, 23, -4, -7] >>> # The cmp parameter has been removed in Python 3 >>> sorted_l = sorted(l, cmp=lambda x, y: x ** 3 - y ** 3) # Sort with cmp >>> sorted_l = sorted(l, key=lambda x: x ** 3) # Sort with key >>> print(sorted_l) [-7, -4, 6, 8, 10, 23] In Python 3, the cmp parameter has been removed, mainly for two reasons. First, everything done with cmp can be done with key. Second, key is faster than cmp. When cmp is passed as a parameter, the sorting algorithm compares pairs of values and the comparing function is called multiple times for every item. On the other hand, key performs the computation only once. Thus, the complexity is reduced. This makes the code less prone to error, as the syntax is simplified. (Before key, it was possible to benefit from it by following the principle of Decorate-Sort-Undecorate, also known as Schwartzian transform.) If you are familiar with Java or C++, you might be more familiar with cmp than key. In fact, in Python 3, you can use cmp with functools.cmp_to_key(func), which will convert cmp to key. Let's explore this more in the next section. Custom Sort Functions in Python with functools.cmp_to_key(func) functools.cmp_to_key(func) is used to transform an old-style comparison function to a key function. It’s available in Python 2.7, Python 3.2, and later. According to the Python 3 documentation, “a comparison function is any callable that accepts two arguments, compares them, and returns a negative number for less-than, zero for equality, or a positive number for greater-than. A key function is a callable that accepts one argument and returns another value to be used as the sort key.” Before Python 2.4, There was no sorted() and list.sort() took no keyword argument. Instead, Python 2 supported a cmp parameter to handle user-specified comparison functions. When porting a code from Python 2 to Python 3, you might have to convert the function from cmp to key. In Python 3, functools.cmp_to_key(func) was introduced to facilitate the process. We will use functools.cmp_to_key(func) with functions that accept key functions such as sorted() or itertools.groupby(), which I talked about in my earlier article. Using our previous example to sort numbers by their cube value, you can write a custom cmp function as follows: >>> import functools >>> l = [6, 8, 10, 23, -4, -7] >>> def compare(x, y): ... return x ** 3 - y ** 3 >>> sorted_l = sorted(l, key=functools.cmp_to_key(compare)) >>> print(sorted_l) [-7, -4, 6, 8, 10, 23] Sometimes, using key might be less obvious than cmp. In this case, it might be better to use functools.cmp_to_key(func), as it can be more readable and intuitive. For example, in last year’s matura (a Polish exam similar to A Levels, Abitur, or Baccalauréat), the optional IT part had an exercise that included this: Pair (number1, word1) is smaller than pair (number2, word2) if: number1 < number2 Or: number1 == number2 and word1 is alphabetically smaller than word2. For example, pair (1, bbbb) is smaller than pair (2, aaa), But pair (3, aaa) is smaller than pair (3, ab). In other words, we want the pair to be sorted in ascending order on the first element and the second element. Therefore, we expect the pairs to be returned in the following order: (1, bbbb), (2, aaa), (3, aaa), (3, ab). Below is a custom cmp function to solve this problem: from functools import cmp_to_key def compare(pair1, pair2): number1, word1 = pair1 number2, word2 = pair2 if number1 == number2: if word1 < word2: return -1 else: return 1 if number1 < number2: return -1 else: return 1 compare_key = cmp_to_key(compare) But even in this case, we can solve the problem with key by sorting a list of tuples: >>> # List of tuples >>> l = [(3, 'aaa'), (1, 'bbbb'), (3, 'ab'), (2, 'aaa')] >>> # Sort with key on first and second element of each tuple >>> sorted(l, key = lambda x: (x[0], x[1])) [(1, 'bbbb'), (2, 'aaa'), (3, 'aaa'), (3, 'ab')] We can also try to make the problem more challenging by sorting the first element in descending order and the second one in ascending order. Again, we can solve it with key: >>> # Sort number in descending order and word in ascending order >>> sorted(l, key = lambda x: (-x[0], x[1])) [(3, 'aaa'), (3, 'ab'), (2, 'aaa'), (1, 'bbbb')] Suppose we turn the problem the other way around, with the first element in ascending order and the second in descending order. In this case, passing the reverse parameter as True will solve it. >>> # Sort number in ascending order and word in descending order >>> sorted(l, key = lambda x: (-x[0], x[1]), reverse=True) [(1, 'bbbb'), (2, 'aaa'), (3, 'ab'), (3, 'aaa')] It is challenging to find a case where cmp cannot be replaced by key. Because performance-wise functools.cmp_to_key(func) is very slow compared to key, it only should be used as a last resort to implement a custom sort function in Python. If you want to know more about mapping functions, look at my article on filter(), map(), and reduce(). Closing Thoughts on Custom Sort Functions in Python In this article, we explored how to implement custom sort and comparison functions in Python. We have learned a bit of Python history and tried to understand the choices made with cmp and key between Python 2 and 3 to implement custom sort functions in Python. To better understand the concepts explained in these articles, it is always a good idea to play with the code snippets and to build your own examples. Finally, if you want to learn more about data manipulation in Python, feel free to check Yigit’s excellent article on How to Filter Rows and Select Columns in a Python Data Frame With Pandas. And if you want to take things to the next level, try our Python for Data Science track. Happy learning! Tags: python python basics sort