What is a 'set' in Python?
A foundational guide to the set, Python's data structure for storing unordered collections of unique elements. Learn how to create sets and perform common mathematical set operations like union and intersection.
In Python, a set is a collection that is both unordered and unindexed. Its most important characteristic is that it can contain only unique elements: if you add a value that is already present, the set is left unchanged, so duplicates never accumulate.
Sets are modeled on the mathematical concept of a set, and they provide powerful and efficient methods for performing standard set operations like union, intersection, and difference.
Creating a Set
You can create a set by placing a comma-separated sequence of items inside curly braces {}.
# Create a set of numbers with a set literal
numbers = {1, 2, 3, 4, 4, 4} # Duplicates are automatically removed
print(numbers) # {1, 2, 3, 4}
To create an empty set, you must use the set() function. Using empty curly braces {} will create an empty dictionary, not an empty set.
empty_set = set()
empty_dict = {}
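Besides literals, set() accepts any iterable. The short sketch below illustrates this with a list, a string, and a range, and confirms the difference between {} and set(); the variable names are chosen only for illustration.

```python
# Build sets from different iterables; duplicates collapse in each case.
from_list = set([3, 1, 3, 2])
from_string = set("hello")      # one element per distinct character
from_range = set(range(3))

print(from_list == {1, 2, 3})   # True
print(len(from_string))         # 4 (the distinct characters h, e, l, o)
print(from_range == {0, 1, 2})  # True

# Empty braces give a dict; set() gives a set.
print(type({}).__name__)        # dict
print(type(set()).__name__)     # set
```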
Key Properties of Sets
- Unordered: The items in a set do not have a defined order. You cannot access items by an index.
- Unique: A set cannot have two items with the same value.
- Mutable: You can add and remove items from a set.
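The properties above can be checked directly. The sketch below is illustrative only; it also relies on one fact not stated in the list, namely that set elements must be hashable, which is why a list cannot be stored in a set but a tuple of hashable items can.

```python
letters = {"a", "b", "c"}

# Unordered and unindexed: subscripting a set raises TypeError.
try:
    letters[0]
    indexing_works = True
except TypeError:
    indexing_works = False

# Assumption made explicit: elements must be hashable, so a list is rejected.
try:
    bad = {[1, 2]}
    list_allowed = True
except TypeError:
    list_allowed = False

# Unique: the repeated tuple is stored only once.
pairs = {(1, 2), (1, 2), (3, 4)}

# Mutable: the set itself can grow and shrink.
letters.add("d")
letters.discard("a")

print(indexing_works)               # False
print(list_allowed)                 # False
print(len(pairs))                   # 2
print(letters == {"b", "c", "d"})   # True
```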
Adding and Removing Items
add(): Adds a single element to the set. If the element is already in the set, it does nothing.
my_set = {1, 2, 3}
my_set.add(4)
update(): Adds all the items from another iterable (like a list or another set) to the set.
my_set.update([4, 5, 6])
remove(): Removes a specified element. It will raise a KeyError if the item is not found.
my_set.remove(3)
discard(): Also removes a specified element, but it will not raise an error if the item is not found.
my_set.discard(99) # Does nothing, no error
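The difference between remove() and discard() is easiest to see side by side. The following is a minimal sketch; the variable named missing is introduced here only to capture the value reported by the exception.

```python
my_set = {1, 2, 3}

# remove() raises KeyError when the element is absent.
missing = None
try:
    my_set.remove(99)
except KeyError as exc:
    missing = exc.args[0]
    print(f"remove() raised KeyError for {missing}")

# discard() silently ignores an absent element.
my_set.discard(99)

print(my_set == {1, 2, 3})  # True: the set is unchanged in both cases
```

As a rule of thumb, remove() suits cases where a missing element indicates a bug, and discard() suits cases where absence is expected.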
Common Use Cases
1. Removing Duplicates from a List
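This is one of the most common and elegant use cases for a set. You can quickly remove all duplicate items from a list by converting it to a set and then back to a list. Keep in mind that a set is unordered, so the resulting list is not guaranteed to keep the original order of the items.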
my_list = [1, 2, 2, 3, 4, 4, 5, 5, 5]
# Convert to a set to remove duplicates, then back to a list
unique_list = list(set(my_list))
print(unique_list) # e.g. [1, 2, 3, 4, 5] (element order is not guaranteed)
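When the original order matters, a common alternative is dict.fromkeys(), which relies on dictionaries preserving insertion order (guaranteed since Python 3.7). This is a swapped-in technique rather than a set-based one, shown here as a sketch with an illustrative input list.

```python
items = [3, 1, 3, 2, 1]

# Set round trip: removes duplicates, but the order is not guaranteed.
unique_unordered = list(set(items))

# dict.fromkeys keeps the first occurrence of each item, in order.
unique_in_order = list(dict.fromkeys(items))

# sorted() gives a deterministic order when the items are comparable.
unique_sorted = sorted(set(items))

print(unique_in_order)  # [3, 1, 2]
print(unique_sorted)    # [1, 2, 3]
```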
2. Membership Testing
Checking whether an item exists in a set takes O(1) time on average, because sets are implemented with a hash table. The same check on a list is O(n), because Python has to scan the elements one by one.
my_set = {1, 2, 3, 4, 5}
print(3 in my_set) # True
print(10 in my_set) # False
If you need to frequently check for the existence of items in a large collection, a set is a much better choice than a list.
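A rough way to see the difference is to time the same lookup against a list and a set using the standard timeit module. The collection size and repeat count below are arbitrary choices for illustration, and the exact timings will vary by machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Time 200 membership checks against each collection.
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```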
Set Operations
Sets support standard mathematical operations.
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
Union (|): Returns a new set containing all items from both sets.
a | b # {1, 2, 3, 4, 5, 6}
Intersection (&): Returns a new set containing only the items present in both sets.
a & b # {3, 4}
Difference (-): Returns a new set containing items in the first set but not in the second set.
a - b # {1, 2}
Symmetric Difference (^): Returns a new set with items in either set, but not both.
a ^ b # {1, 2, 5, 6}
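Each operator also has a method form: union(), intersection(), difference(), and symmetric_difference(). One practical difference is that the methods accept any iterable, while the operators require both operands to be sets. The sketch below uses an illustrative list named b_list to show this.

```python
a = {1, 2, 3, 4}
b_list = [3, 4, 5, 6]  # a plain list, not a set

# The method forms accept any iterable.
union = a.union(b_list)
common = a.intersection(b_list)
only_a = a.difference(b_list)
either = a.symmetric_difference(b_list)

# The operator forms require a set on both sides.
try:
    a | b_list
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False

print(union == {1, 2, 3, 4, 5, 6})  # True
print(common == {3, 4})             # True
print(only_a == {1, 2})             # True
print(either == {1, 2, 5, 6})       # True
print(operator_accepts_list)        # False
```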
Conclusion
While lists and dictionaries are often the go-to collection types, Python's set provides a powerful and efficient tool for specific tasks. Its ability to enforce uniqueness and perform high-speed membership tests makes it perfect for removing duplicates and checking for existence. Furthermore, its support for mathematical set operations provides a clean and readable syntax for comparing and combining collections.