An In-Depth Guide to Working with Python Sets

Do a deep dive into Python sets with our guide. Master set operations and commonly used set methods and learn relevant use cases for this fundamental data structure.

Have you heard about Python sets? They may not be as famous as other data structures in Python, such as lists and dictionaries. But ignoring Python sets is a mistake: they are flexible, can be used to create complex filtering logic, and often outperform lists for tasks such as membership tests.

In this guide to Python sets, we will start by defining what they are. Then we’ll teach you all about set operations in Python and explore the methods available for Python sets. Finally, we’ll explain how we can leverage sets in some relevant use cases.

This article assumes that you have basic Python knowledge. If that’s not the case, don’t worry! Head over to our Learn Programming with Python track and start learning Python right now. We have over 35 hours of content and hundreds of coding challenges, and there are even dedicated exercises for Python sets and set operations. Don’t miss out on all of this content!

So What Exactly Are Python Sets?

In a nutshell, sets in Python are a mutable collection of unordered, unique immutable elements. Here’s what each of these attributes means:

  • Being mutable means that Python sets are changeable and do not have a fixed size: you can add and remove elements from a single set.
  • Since Python sets are collections, you can iterate over their elements using a for loop.
  • Being unordered means that you cannot get the first or last element in a set. You also cannot guarantee the order of elements when iterating over a set. Think about a set as a bag full of items: you can add and remove items from the bag, but it doesn’t make sense to ask for the “first” item in it.
  • Since their elements are unique, no two items inside a set are the same; there cannot be duplicate values. In fact, repeatedly adding the exact same item to a set does nothing.
  • While a Python set is a mutable data type, the elements in it must be immutable (more precisely, hashable). Therefore, mutable data structures like lists and dictionaries are not allowed inside Python sets.
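The properties above can be checked with a short sketch. The variable names (numbers, points, rejected) are illustrative and not part of the original article:

```python
# Duplicates collapse: the literal below contains 3 twice.
numbers = {3, 1, 3, 2}
print(len(numbers))  # 3

# Adding an element that is already present leaves the set unchanged.
numbers.add(1)
print(len(numbers))  # 3

# Hashable (immutable) elements such as tuples are accepted...
points = {(0, 0), (1, 2)}
print(len(points))  # 2

# ...but a mutable list is rejected with a TypeError.
try:
    invalid = {[1, 2], [3, 4]}
except TypeError as error:
    rejected = True
    print("Cannot store a list in a set:", error)
else:
    rejected = False
```

If you need to store a group of values as a single set element, a tuple is usually the simplest hashable substitute for a list.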

In the definition above, Python sets may not sound very impressive. But one of their biggest powers lies in combining and filtering elements from multiple sets through the use of set operations.

Performing Set Operations in Python

One of the strongest selling points of Python sets is their set operations. These operations let you quickly combine and filter elements from multiple sets in a variety of ways. They are also fast in Python, typically much quicker than combining lists by iterating over them.

In order to visualize these operations, let’s consider the two following Python sets:

A = {1, 2, 4, 6, 8}
B = {1, 2, 3, 4, 5}

Note the syntax for defining a set: we use curly braces to enclose the elements in the set, and commas to separate each item. We can represent these sets as a Venn diagram:

Sets A and B represented as a Venn diagram.

Now let’s look at Python’s set operations.

Set Union

Set union simply bundles all elements across all sets together. It can be performed in Python either through the .union() set method or the | (pipe) operator:

A = {1, 2, 4, 6, 8}
B = {1, 2, 3, 4, 5}

# Set union using .union()
print(A.union(B))

# Set union using the | operator
print(A | B)

# Output:
# {1, 2, 3, 4, 5, 6, 8}

Note how the duplicate elements are absent from the result set:

Union of sets A and B (shaded area), represented on a Venn diagram.
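The method and the operator are not fully interchangeable. As a small sketch (set C and the flag operator_needs_sets are made up for illustration), .union() accepts several arguments of any iterable type, while | only works between sets:

```python
A = {1, 2, 4, 6, 8}
B = {1, 2, 3, 4, 5}
C = {8, 9}  # hypothetical third set, for illustration only

# .union() accepts several arguments, and they can be any iterable.
combined = A.union(B, C, [10, 11])
print(sorted(combined))
# Output: [1, 2, 3, 4, 5, 6, 8, 9, 10, 11]

# The | operator requires both operands to be sets.
try:
    A | [10, 11]
except TypeError:
    operator_needs_sets = True
else:
    operator_needs_sets = False

print(operator_needs_sets)
# Output: True
```

The same distinction applies to the other set operations: the methods take arbitrary iterables, while the operators expect sets on both sides.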

Set Intersection

The set intersection operator returns only the elements that are common to both sets. We can perform a set intersection in Python using either the .intersection() set method or the & (ampersand) operator:

A = {1, 2, 4, 6, 8}
B = {1, 2, 3, 4, 5}

# Use the method
print(A.intersection(B))

# Use the & operator
print(A & B)

# Output:
# {1, 2, 4}

As expected, only the elements common to A and B are returned after performing a set intersection.

Intersection of sets A and B (shaded area), represented on a Venn diagram.

Set Difference

Performing a set difference between sets A and B means returning only the elements present in A that are not present in B. In Python, we can use the .difference() set method or the - (minus) operator for finding the difference between two sets:

A = {1, 2, 4, 6, 8}
B = {1, 2, 3, 4, 5}

# Use the method
print(A.difference(B))

# Use the - operator
print(A - B)

# Output:
# {6, 8}

Note that, unlike other set operations, the order of sets matters when performing a set difference. The difference between A and B is not the same as the set difference between B and A! Here are the same operations as above, but with A and B swapping places:

A = {1, 2, 4, 6, 8}
B = {1, 2, 3, 4, 5}

# Use the method
print(B.difference(A))

# Use the - operator
print(B - A)

# Output:
# {3, 5}

Difference of set A with respect to set B (left figure, shaded area), and set B with respect to set A (right figure, shaded area), represented on a Venn diagram.

Set Symmetric Difference

The set symmetric difference operator returns the elements that are unique to either A or B, ignoring the elements that are common to both sets. In Python, we can perform it using either the .symmetric_difference() set method or the ^ (caret) operator:

A = {1, 2, 4, 6, 8}
B = {1, 2, 3, 4, 5}

print(A.symmetric_difference(B))

# Output:
# {3, 5, 6, 8}

A set symmetric difference can be interpreted either as the union of A and B with their intersection removed, or as the union of the two differences A - B and B - A. The Venn diagram makes this clearer:

Symmetric difference (shaded area) of sets A and B, represented on a Venn diagram.
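Both readings of the symmetric difference can be verified directly. The sketch below uses the ^ operator, which is the operator form of .symmetric_difference(); the name sym_diff is illustrative:

```python
A = {1, 2, 4, 6, 8}
B = {1, 2, 3, 4, 5}

# Operator form of A.symmetric_difference(B)
sym_diff = A ^ B

# Reading 1: the union with the intersection removed
print(sym_diff == (A | B) - (A & B))
# Output: True

# Reading 2: the union of both one-sided differences
print(sym_diff == (A - B) | (B - A))
# Output: True

print(sorted(sym_diff))
# Output: [3, 5, 6, 8]
```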

With this, we have covered the most important set operations in Python. If you want to dive deeper into set operations, head over to our article on set operations with examples. Otherwise, let’s move on to supersets and subsets!

Supersets, Subsets, and Membership Using Python Sets

In addition to the set operations described above, Python sets include a few other useful methods. For instance, it is often useful to find out if a set contains all elements from another set. We can leverage the concept of supersets and subsets for this.

Let’s define two new sets A and B:

A = {1, 2, 3, 4, 5}
B = {1, 2, 4}

We say that set A is a superset of set B, since all elements of B are also in A. For the same reason, set B is said to be a subset of set A.

We can test whether sets in Python are supersets or subsets of each other using the .issuperset() and .issubset() methods, respectively:

A = {1, 2, 3, 4, 5}
B = {1, 2, 4}

print(A.issuperset(B))
# Output: True

print(B.issubset(A))
# Output: True

Set A is a superset of set B, and set B is a subset of set A.

If you need to check whether an arbitrary element is in a set, you can simply use the in keyword:

B = {1, 2, 4}
print(1 in B)
# Output: True

print(3 in B)
# Output: False

As a technical aside, an empty set is a subset of all Python sets. Additionally, a set in Python is both a superset and a subset of itself. Both of these facts derive from the definitions of supersets/subsets that we described above.
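These edge cases are easy to confirm. The sketch below also shows the comparison operators that Python sets support for the same checks, where < and > test for strict (proper) subsets and supersets; the name empty is illustrative:

```python
A = {1, 2, 3, 4, 5}
B = {1, 2, 4}
empty = set()

# The empty set is a subset of every set.
print(empty.issubset(A))
# Output: True

# Every set is both a subset and a superset of itself.
print(A.issubset(A), A.issuperset(A))
# Output: True True

# Operator forms: <= and >= match the methods above.
print(B <= A, A >= B)
# Output: True True

# < and > are strict: the two sets must also differ.
print(B < A, A < A)
# Output: True False
```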

Creating and Manipulating Sets in Python

There are a few ways to create sets in Python. One way is to define a set literal using curly braces, much like we have been doing so far:

S = {1, 10, 100}

Another alternative is to pass an iterable, such as a list, to the set() function:

S = set([1, 10, 100])

Note that the set() function is also the way to initialize an empty set, should you need it. Empty curly braces will not work here, because {} creates an empty dictionary:

empty_set = set()
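A quick check of the types shows why empty curly braces cannot be used for this. The name not_a_set is made up for illustration:

```python
# Empty curly braces create a dictionary, not a set.
not_a_set = {}
print(type(not_a_set))
# Output: <class 'dict'>

# set() with no arguments creates an empty set.
empty_set = set()
print(type(empty_set))
# Output: <class 'set'>

print(len(empty_set))
# Output: 0
```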

Once a set is created, you can modify it using the .add() and .remove() methods:

S = set()

S.add(2)
print(S)
# Output: {2}

S.remove(2)
print(S)
# Output: set()

Calling the .add() method repeatedly with the same value does nothing (because sets have only unique values). On the other hand, calling the .remove() method with a value that is not in the set will raise a KeyError, so watch out for that.
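If a missing value should not be treated as an error, the built-in .discard() method is a gentler alternative to .remove(). A minimal sketch, where the flag remove_failed is illustrative:

```python
S = {1, 2, 3}

# .remove() raises a KeyError when the value is missing.
try:
    S.remove(10)
except KeyError:
    remove_failed = True
    print("10 is not in the set")
else:
    remove_failed = False

# .discard() silently does nothing when the value is missing...
S.discard(10)

# ...and removes the value when it is present.
S.discard(2)
print(sorted(S))
# Output: [1, 3]
```

Which one to use depends on whether a missing element signals a bug in your program (prefer .remove()) or is an expected situation (prefer .discard()).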

You can also use the .update() method to add all values from another set:

A = {1, 2, 3}
B = {3, 4, 5}

S = set()

S.update(A)
print(S)
# Output: {1, 2, 3}

S.update(B)
print(S)
# Output: {1, 2, 3, 4, 5}

We actually have an entire article on adding values from a list to a set, so be sure to check that out as well.
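As a brief sketch of that idea, .update() is not limited to sets: it accepts any iterable, and several of them at once. The values below are arbitrary examples:

```python
S = {1, 2, 3}

# Add every value from a list; duplicates are ignored.
S.update([3, 4, 4, 5])
print(sorted(S))
# Output: [1, 2, 3, 4, 5]

# Several iterables of different types can be passed in one call.
S.update((6, 7), range(8, 10))
print(sorted(S))
# Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
```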

The .copy() and .clear() methods can be used to duplicate a set and remove all of its elements, respectively:

S1 = {1, 2, 3}
S2 = S1.copy()  # independent copy of S1
S1.clear()

print(S1)
# Output: set()

print(S2)
# Output: {1, 2, 3}

Finally, you can iterate over sets using a for loop, much like any other iterable:

S = {1, 2, 3, 4, 5}
for number in S:
    print(number)

# Output:
# 1
# 2
# 3
# 4
# 5

But be careful: the order of iteration over the elements is arbitrary! Even if the elements sometimes seem to be ordered, as in the example above, Python itself does not guarantee any ordering when iterating over sets. If the order of the elements is crucial for your code, consider switching from sets to lists.
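When you only need a predictable order at the moment of iteration, one option is to pass the set to the built-in sorted() function, which returns a new sorted list. A small sketch with made-up values:

```python
S = {"pear", "apple", "fig"}

# sorted() returns a new list, so the iteration order is deterministic.
ordered = sorted(S)

for fruit in ordered:
    print(fruit)

# Output:
# apple
# fig
# pear
```

This works as long as the elements can be compared with each other, which is the case for a set containing only strings or only numbers.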

If you’re unsure how for loops work, be sure to read our article on how to write a for loop.

Where Can You Use Python Sets?

While Python sets may seem like one of the more esoteric data types in the language, they have legitimate uses in many situations. A common example is to remove duplicates from a list by transforming it into a set and back into a list:

values = [1, 2, 2, 1, 3, 4, 1, 2, 3, 4, 1]
unique_values = list(set(values))

print(unique_values)
# Output: [1, 2, 3, 4]

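As desired, the unique_values list is a deduplicated version of the initial values list. Keep in mind that, because sets are unordered, the order of the elements in unique_values is not guaranteed to match their order in values.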
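If the original order matters, a commonly used alternative (not a set-based one) relies on dictionaries, which preserve insertion order in Python 3.7 and later. A minimal sketch with a made-up list:

```python
values = [3, 1, 3, 2, 1, 4]

# dict.fromkeys() keeps the first occurrence of each value, in order.
unique_in_order = list(dict.fromkeys(values))

print(unique_in_order)
# Output: [3, 1, 2, 4]
```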

If lists and sets are still very confusing to you, take a look at our article on differences between Python lists, tuples and sets.

Another common application for sets is filtering data. Let’s say that you have three sets of IDs: one of all of your clients, and two others of clients who bought product X or Y:

all_clients = {104, 203, 255, 289, 448}
clients_bought_X = {104, 448}
clients_bought_Y = {104, 255, 289}

Let’s try answering the following questions:

  • Which clients bought product X and did not buy product Y?
  • Which clients bought both products?
  • Which clients bought nothing?

All of these questions can be addressed with set operations:

all_clients = {104, 203, 255, 289, 448}
clients_bought_X = {104, 448}
clients_bought_Y = {104, 255, 289}

# Question 1
print(clients_bought_X.difference(clients_bought_Y))
# Output: {448}

# Question 2
print(clients_bought_X.intersection(clients_bought_Y))
# Output: {104}

# Question 3
purchasers = clients_bought_X.union(clients_bought_Y)
print(all_clients.difference(purchasers))
# Output: {203}

In this simple example, you could probably figure out the answers just by looking at the sets. But if you had thousands of client IDs, you’d be hard pressed to find a better solution than Python sets!
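To get a feel for the larger case, here is a sketch with ten thousand synthetic client IDs. The data is invented purely for illustration: clients with even IDs are assumed to have bought X, and clients whose IDs are multiples of 3 are assumed to have bought Y. The operator forms are used here, but the methods would work just as well:

```python
# Synthetic data: IDs 1 to 10000 (an assumption for this example)
all_clients = set(range(1, 10_001))
clients_bought_X = {cid for cid in all_clients if cid % 2 == 0}
clients_bought_Y = {cid for cid in all_clients if cid % 3 == 0}

# Question 1: bought X but not Y
only_X = clients_bought_X - clients_bought_Y

# Question 2: bought both products
both = clients_bought_X & clients_bought_Y

# Question 3: bought nothing
nothing = all_clients - (clients_bought_X | clients_bought_Y)

print(len(only_X), len(both), len(nothing))
# Output: 3334 1666 3333
```

The code is exactly as short as in the five-client example; only the size of the input changed.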

Keep Learning About Sets and Much More!

That was a lot! Hopefully you now have a good grasp on what Python sets are and just how useful and versatile they can be.

But don’t stop now! Head over to our Python Data Structures in Practice course and practice everything you have learned about sets as well as all of Python’s other data structures. And if you’re in need of some guidance in your Python journey, try reading our article on how to improve your Python skills. See you in the next article!