What Are Generators in Python?

Have you ever dealt with datasets so large that your computer's memory couldn’t handle them? Have you ever wondered if there could be a way to interrupt a function in the middle before resuming it? Here is where Python generators come into the picture.

Python generators are a way to create custom iterators. You can read more about iterators here. Before we continue, if you are not familiar with Python terminology, check our articles about Python terms for beginners and more Python terms. And if you’re not comfortable with operations on data structures in Python, you might want to try our Built-In Algorithms in Python course.

You can loop over generator objects as you would with a list. But unlike lists, generators do not store their contents in memory; they produce one value at a time, on demand. A widespread use case is when you have to deal with files bigger than your machine's memory can handle, e.g. a large dataset. Trying to load such a file into memory all at once can result in a MemoryError.
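As a minimal sketch of the large-file case, assuming a hypothetical read_lines() helper and a small temporary file standing in for the real dataset, a generator can hand back one line at a time so that only the current line sits in memory. The yield keyword it relies on is explained in the next section.

```python
import os
import tempfile


def read_lines(path):
    """Yield one line at a time instead of loading the whole file (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:  # file objects are read lazily, line by line
            yield line.rstrip("\n")


# Assumption: a tiny temporary file stands in for a dataset too large for memory.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("id,value\n1,10\n2,20\n3,30\n")
    sample_path = tmp.name

# Count the lines without ever building a list of them.
row_count = sum(1 for _ in read_lines(sample_path))
os.remove(sample_path)
print(row_count)
```

The same loop would work unchanged on a much larger file, because at no point is the full content held in a list.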

By using a Python generator, you can avoid such an issue. But wait! How do you define Python generators?

How to Define Generators in Python

A Python generator function looks very much like a regular Python function, but it produces its values with the yield keyword instead of return. Let's write a quick example using a for loop.

def regular_function(x):
    for i in range(x):
        return i*5

Once executed as regular_function(10), this regular function will return 0, because the execution stops after the first iteration.

However, let's write it a bit differently:

def generator(x):
    for i in range(x):
        yield i*5

The yield keyword is what turns the function into a generator function: calling it returns a generator object instead of running the body right away. yield is also there to control the flow of the generator. When the program reaches it, the function's execution is paused, and the yielded value is handed back to the caller.

At this point, the state of the function is saved, and the function resumes its execution whenever the next value is requested. A return statement, by contrast, stops the function altogether.
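The pause-and-resume behaviour can be made visible with a small sketch. The traced_steps() function and the log list are hypothetical names introduced only for this illustration; the function records a message each time a part of its body actually runs.

```python
def traced_steps(log):
    """Hypothetical generator that logs when each part of its body runs."""
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("resumed after second yield")


log = []
steps = traced_steps(log)   # creating the generator runs none of the body
log_before = list(log)      # still empty at this point

first = next(steps)         # runs up to the first yield, then pauses
second = next(steps)        # resumes after the first yield, pauses at the second

print(log_before)
print(first, second)
print(log)
```

The last message is never logged, because the generator is still paused at its second yield.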

Therefore, when we run ..

>>> generator(10)

.. we get:

<generator object generator at 0x00000262F8EBB190>

Next, we instantiate the generator object as g:

>>> g = generator(10) 

To advance the generator in Python, we pass it to the built-in next() function. In the following example, we include print statements to get some outputs:

>>> print(next(g))
0
>>> print(next(g))
5
>>> print(next(g))
10
>>> print(next(g))
15

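The built-in next() function is not specific to generators (it works on any iterator), and it is not the only way to consume a generator: a for loop calls next() for you and exits when the generator is exhausted.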
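As a short sketch of consuming a generator with a for loop rather than repeated next() calls, the example below reuses the generator() function defined earlier. It also shows that a generator object can be iterated only once.

```python
def generator(x):
    for i in range(x):
        yield i * 5


# A for loop requests values one by one and stops when none are left.
values = []
for value in generator(4):
    values.append(value)
print(values)  # [0, 5, 10, 15]

# A generator object is single-use: a second pass finds nothing.
g = generator(3)
first_pass = list(g)
second_pass = list(g)
print(first_pass, second_pass)  # [0, 5, 10] []
```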

Another way to define a generator in Python is to use generator comprehensions (called generator expressions in the Python documentation). Very similar to list comprehensions, generator comprehensions can be defined as:

gen_comp = (i*5 for i in range(10))

Compared to list comprehensions, generator comprehensions have the advantage of not building and holding the entire object in memory before iteration. Let's compare a generator comprehension with a list comprehension:

list_comp = [i*5 for i in range(10000)]
gen_comp = (i*5 for i in range(10000))

These expressions look very similar; the only visible difference is square brackets versus parentheses. Even so, the resulting objects are very different. Let's have a look at their size:

>>> import sys 
>>> list_comp
>>> print('list comprehension:', sys.getsizeof(list_comp), 'bytes')
list comprehension: 87616 bytes
>>> gen_comp 
>>> print('generator comprehension:', sys.getsizeof(gen_comp), 'bytes')
generator comprehension: 112 bytes

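In this case, the list object is about 782 times larger than the generator object. Note that sys.getsizeof() reports only the size of the container itself, not of the integers it refers to, and the exact numbers vary between Python versions and platforms. Either way, if memory is an issue, you would be better off using a Python generator.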
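To illustrate why the gap widens with more data, here is a small sketch, assuming a hypothetical times_five() helper that wraps the generator comprehension. The size of the generator object stays the same however many items it will produce, while the list grows with its length. No byte counts are shown because they depend on the interpreter.

```python
import sys


def times_five(limit):
    """Return a generator comprehension over `limit` items (hypothetical helper)."""
    return (i * 5 for i in range(limit))


small_gen_size = sys.getsizeof(times_five(10))
large_gen_size = sys.getsizeof(times_five(1_000_000))

small_list_size = sys.getsizeof([i * 5 for i in range(10)])
large_list_size = sys.getsizeof([i * 5 for i in range(100_000)])

print(small_gen_size == large_gen_size)   # the generator object does not grow
print(small_list_size < large_list_size)  # the list does
```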

Last but not least, a generator comprehension produces the same kind of object as a generator function; the difference is purely syntactic. Generator comprehensions are single expressions, which makes them convenient for simple cases.
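A quick way to check this equivalence is sketched below. The times_five() name is a hypothetical stand-in for the generator function shown earlier.

```python
def times_five(limit):
    """Generator function equivalent to the comprehension below (hypothetical name)."""
    for i in range(limit):
        yield i * 5


from_function = times_five(10)
from_comprehension = (i * 5 for i in range(10))

# Both are plain generator objects and yield the same values.
same_type = type(from_function) is type(from_comprehension)
same_values = list(from_function) == list(from_comprehension)
print(same_type, same_values)  # True True
```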

If you need to produce an infinite sequence for some reason, a Python generator is the natural tool, because it never has to hold the whole sequence at once. While your sequence can be infinite, your computer's memory is certainly not.

def infinity():
    n = 0
    while True:
        yield n*n
        n += 13

We initialize a variable n and start an infinite while loop. Each time a value is requested, yield hands back n*n and pauses the function; when it resumes, n is incremented by 13 and the loop continues. If you iterate over this generator with a for loop, it will keep producing values until you stop it manually.
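One way to take a bounded slice of an infinite generator, sketched here with the standard library's itertools.islice, is to ask for only the first few values. A for loop with a break condition is another option; the limit of 2000 below is an arbitrary value chosen for the illustration.

```python
from itertools import islice


def infinity():
    n = 0
    while True:
        yield n * n
        n += 13


# islice stops after five values, so the infinite loop never runs away.
first_five = list(islice(infinity(), 5))
print(first_five)  # [0, 169, 676, 1521, 2704]

# Alternatively, break out of a for loop once a condition is met.
below_limit = []
for square in infinity():
    if square > 2000:  # arbitrary cut-off for this example
        break
    below_limit.append(square)
print(below_limit)  # [0, 169, 676, 1521]
```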

In our case, by calling next(), we can manually iterate repeatedly, which is helpful to test the generator to make sure that it produces the expected output. Does this mean that the generator can continue indefinitely?

How to End Generators in Python

First, it can stop naturally. In other words, once all the values have been evaluated, the iteration will stop, and the for loop will exit.

If you call next() on a generator that has run out of values, you will get an explicit StopIteration exception.
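A minimal sketch of both options when stepping manually: catching StopIteration, or passing a default value as the second argument to next() so that no exception is raised. The exhausted and fallback names are used only for this example.

```python
def generator(x):
    for i in range(x):
        yield i * 5


g = generator(2)
results = [next(g), next(g)]  # the generator only has two values

try:
    next(g)                   # a third request has nothing left to return
except StopIteration:
    exhausted = True
else:
    exhausted = False

# With a default, next() returns it instead of raising StopIteration.
fallback = next(g, None)
print(results, exhausted, fallback)  # [0, 5] True None
```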

Another way to end a Python generator is to use the close() method, as follows:

>>> def generator(x):
...     for i in range(x):
...         yield i*5
...
>>> g = generator(10)
>>> print(next(g))
0
>>> print(next(g))
5
>>> print(g.close())
None

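The close() method raises a GeneratorExit exception inside the generator, at the yield where it was paused, and stops its execution. It returns None here, and any later call to next() raises StopIteration. It can be handy to control the flow of an infinite generator.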
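The sketch below shows where GeneratorExit arrives, using a variant of the infinity() generator from earlier. The events list and the except block are additions made for this illustration, so that the generator can record that it was closed.

```python
def infinity(events):
    """Variant of infinity() that records when it is closed (illustrative only)."""
    n = 0
    try:
        while True:
            yield n * n
            n += 13
    except GeneratorExit:
        # close() raises GeneratorExit at the paused yield; clean up, then re-raise.
        events.append("closed")
        raise


events = []
g = infinity(events)
first = next(g)
second = next(g)
g.close()

try:
    next(g)  # a closed generator has nothing more to give
except StopIteration:
    events.append("exhausted")

print(first, second, events)  # 0 169 ['closed', 'exhausted']
```

Handling GeneratorExit is optional. It is mainly useful when the generator holds a resource, such as an open file, that should be released when iteration is abandoned early.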

Closing Thoughts on Generators in Python

In this article, we learned about generators in Python. We discovered how they could be helpful to deal with memory-intensive computations and how they can provide us with more flexibility over our functions (e.g. when testing an output).

I encourage you to explore this article’s examples further and to check out the Python documentation for more information. Our Python Basics courses can also help new programmers get hands-on coding experience. No previous IT knowledge is required.

Last but not least, do not forget to check our other articles. Happy learning!