25th Nov 2021 10 minutes read

An Introduction to Combinatoric Iterators in Python

python

Combinatoric iterators are tools that provide building blocks to make code more efficient. This introduction shows you some of the most useful ones in Python.

Counting Things

In this article, I would like to provide a brief introduction to combinatoric iterators in Python.

Combinatorics in the mathematical sense is about counting things. It can help us count the number of permutations of something (how many possible arrangements of a deck of cards) or the number of combinations (how many unique arrangements of different colored balls). For this to work, we need a collection of objects to act on – something to iterate through.

In Python, iterable objects, more commonly referred to as iterables, are groups of data. Some common iterables you may be familiar with are lists, tuples, sets, and arrays, which you can iterate through using a for loop. These iterables are commonly filled with integer values, floats, or strings. A string itself is an iterable since you can loop through all the characters in it. A related concept is an iterator, which is an object that returns the next element of an iterable.

Putting these two pieces together, we end up with combinatoric iterators. They help you count things: for example, different combinations of numbers in a list or different permutations of a string. The functionality to help you do all this is provided in the itertools module, which comes with the default installation of Python.

Before we get into the details of combinatoric iterators in Python, it’s worth taking a closer look at how to iterate through an iterable. If you're a complete beginner to Python, check out this course which is designed to cater to people with no programming experience.

Iterators, Iterables, and Iteration

We have said iterables are groups of data – a list of integers, for example. But to get the individual elements in a list, we need an iterator. If you’re interested in the details, check out the Python documentation. We can define a list with some integer values as follows:

x = [1, 2, 3]

It’s important to note that when you do this, the entire list is saved into memory. To iterate through this list, the standard approach is to use a for loop, but there’s another way using some of Python's lesser-known built-in functions, specifically iter() and next(). You can define the iterable directly in the iter() method and print the elements as follows:

>>> x_iterator = iter([1, 2, 3])
>>> print(next(x_iterator))
1
>>> print(next(x_iterator))
2
>>> print(next(x_iterator))
3
>>> print(next(x_iterator))
StopIteration

Here, we have created an iterator x_iterator with type <class 'list_iterator'>, out of the iterable [1, 2, 3] with type <class 'list'>. This iterator can be thought of as a stream of integers coming one after the other. To access the integers, we use the built-in next() method to iterate through it, one value at a time. Once accessed, the integer is removed from the stream, and the iteration count is stored as an internal variable, which allows the iterator to remember its place when the next() method is called again. Once the iteration is finished, it raises a StopIteration exception, since all the elements have been removed. This means the iterator can be traversed only once.

At this stage, you may be wondering how for loops relate to all of this, since this is how iteration is normally done. Indeed, a for loop is a type of iterator. Before a for loop is executed, an iterator object is created in the background, then the iteration is performed until the StopIteration exception arises. For those of you who need a refresher on for loops, check out this article. So, with a for loop, the iteration can be achieved by:

>>> x_iterator = iter([1, 2, 3])
>>> for element in x_iterator:
...    print(element)

Notice that we have to redefine x_iterator since we have already hit the StopIteration in the first example. That’s the difference between iterating through the list x directly and iterating through x_iterator. The whole list x is stored in memory and can be repeatedly iterated through, whereas x_iterator is a stream of integers that can be iterated through just once. Therefore, using x_iterator is more efficient, and this really starts to pay off when dealing with a large amount of data.

The itertools Iterators

As the name suggests, the itertools module provides tools for working with iterables and iterators. You can find the documentation here. There are many functions in this module, all of which fall under one of three categories: infinite iterators (think of a while loop), terminating iterators (think of a for loop), and combinatoric iterators (counting things).

They are designed to be memory efficient, so the functions in this module return iterators that provide the results in a stream of data. Since data is produced only when it is needed, iterables don’t need to be stored in memory. This can be a little confusing, so let’s see some concrete examples of how to call these functions and retrieve the results. The functions we’ll look at are of the combinatoric variety and can be useful for making your code more efficient.

product()

The first of the itertools functions we’ll look at is product(), which implements the Cartesian product of two iterables. The mechanics of how this works is illustrated below in the figure and amounts to creating a 2D array from two 1D vectors. The inputs can be any iterable, and the output is given as a list of tuples. If you want a real array, you have to recast the output, for example using NumPy.

Cartesian product of (x, y, z) x (1, 2, 3)

To implement this in Python, simply call the function from itertools as below:

>>> result = itertools.product(['x', 'y', 'z'], [1, 2, 3])

The result variable is now an iterator with type <class 'itertools.product'>. This is in essence the same as the x_iterator from the earlier example, and as such, can be iterated through only once using either a for loop or the next() method. Alternatively, you can recast it as a list, at which point the whole result is stored in memory and can be iterated through multiple times.

>>> result_list = list(result)

Notice also the inputs are a list of strings and a list of integers. The resulting list of tuples maintains these data types. This function can also be used to compute the Cartesian product of an iterable with itself, using the optional repeat argument. The two lines below give the same result:

>>> itertools.product(['x', 'y', 'z'], repeat=2)
>>> itertools.product(['x', 'y', 'z'], ['x', 'y', 'z'])

Try solving this problem without the itertools library and see what you come up with. The most obvious solution is to use two for loops, and loop through every element in both lists, requiring 3 lines of code. Instead, this simple problem is solved much more efficiently with the product() function.

permutations()

A permutation is an arrangement of objects in a particular order. Remember the deck of cards example from the introduction? How many different arrangements exist in a deck of 52 cards, and what do they look like?

Practically speaking, calculating these permutations in Python is not straightforward. For 52 cards, there are 52! (roughly 8 x 10⁶⁷) permutations. This is such a large number that, whenever you pick up a well-shuffled deck of cards, you’re probably holding an arrangement that has never existed before and will never exist again! So please don’t try to calculate this at home – if you do, your computer won’t thank you for it.

Consider a more tractable problem, where we can calculate permutations in Python. How many possible arrangements are there of three different colored balls, and what do they look like?

>>> balls = itertools.permutations(['red', 'green', 'blue'])
>>> for permutation in balls:
...     print(permutation)
...
('red', 'green', 'blue')
('red', 'blue', 'green')
('green', 'red', 'blue')
('green', 'blue', 'red')
('blue', 'red', 'green')
('blue', 'green', 'red')

There are 3! = 3 x 2 x 1 = 6 permutations. This can also be calculated by recasting balls as a list and getting the length with the len() built-in function. Try for yourself. If you want to learn more about some of the most useful built-in functions in Python, check out this course.

combinations()

The next function provides functionality for calculating combinations in Python. This differs slightly from permutations in that the ordering of items is not important when considering combinations. There is an additional keyword argument, r, which defines the length of combinations to find.

Let’s take another look at our example with the colored balls and add a yellow one to the list:

>>> balls = itertools.combinations(['red', 'green', 'blue', 'yellow'], r=3)

The keyword r=3 says we’re interested in considering combinations of 3 balls, of which there are 4, as shown below:

>>> for combination in balls:
...     print(combination)
...
('red', 'green', 'blue')
('red', 'green', 'yellow')
('red', 'blue', 'yellow')
('green', 'blue', 'yellow')

combinations_with_replacement()

As the name suggests, this next function is similar to combinations(), but it allows the items to be repeated more than once. This results in many more possible combinations, so we’ll just show a subset below, but check the full list yourself:

>>> balls = itertools.combinations_with_replacement(['red', 'green', 'blue', 'yellow'], r=3)
>>> for combination in balls:
...     print(combination)
...
('red', 'red', 'red')
('red', 'red', 'green')
('red', 'red', 'blue')
...
('blue', 'blue', 'yellow')
('blue', 'yellow', 'yellow')
('yellow', 'yellow', 'yellow')

A Coding Challenge

The examples above with colored balls demonstrate how some of the itertools functions work, but they are a little dry. So, it’s time for a more relevant example.

When you apply for programming jobs, hiring managers often send a coding challenge to applicants to test their skills. For those of you on the lookout for tech jobs, here’s a useful article. Let’s consider a more interesting problem you may encounter during your next job application and see how itertools can be applied.

Problem Statement: “How many ways can you change a $100 note using any number of $50, $20, and $10 notes?”

A naive approach is to manually generate combinations of 2 notes, 3 notes, 4 notes, etc., and check if they sum to 100. This is error-prone and looks like a salad of for loops, while loops, and if-else statements. But recognizing that the largest possible number of notes is 10 ($10 x 10 = $100) and that the phrase "any number" implies replacement, you can generate a more efficient solution that looks like this:

>>> notes = [50, 20, 10]
>>> result = []
>>> for i in range(1, 11):
...     for combination in itertools.combinations_with_replacement(notes, i):
...         if sum(combination) == 100:
...             result.append(combination)
...
>>> print(len(result))
10

As we have shown, using itertools can help calculate Cartesian products, permutations, and combinations in Python. It greatly simplifies your code by reducing your reliance on loops and conditional statements. What could take multiple lines can be done in just one. This makes code simpler, more readable, and more efficient.

This is already a win, but the next level comes when you start using itertools functions as building blocks to create combined expressions for more complicated iteration-based algorithms. Python also has some built-in iterators which can be used in conjunction with itertools to realize this next level of programming.

Do You Want More Itertools?

We’ve only talked about 4 functions from this library. It’s worth taking a look at the itertools documentation to get a better idea of what this library can do. If you’re interested in getting your hands on more itertools, try the aptly named more-itertools module. This one doesn’t come with Python – so you’ll have to install it yourself, but it’s full of useful functions that will surely make life easier for you at some point in your Python journey.

Tags:

python