In this tutorial, you’ll learn how to use the Counter object from Python’s collections module.
When you’re working with long sequences in Python, such as lists or strings, you may sometimes need to store the items that appear in the sequence along with the number of times each one appears.
A Python dictionary is a suitable built-in data structure for such applications. However, the Counter class from Python’s collections module simplifies this by constructing a counter: a dictionary-like object that maps each item to its count in the sequence.
Over the next few minutes, you’ll learn how to do the following:
Use Python’s counter object
Create a Python dictionary to store count values of items in an iterable
Rewrite the dictionary using Python’s counter with a simplified syntax
Perform operations such as updating and subtracting elements, and finding the intersection of two counter objects
Get the most frequent items in the counter using the most_common() method
Let’s get started!
Python Collections Module and Counter Class
You’ll often use a Python dictionary to store the items in an iterable and their counts. The items and the counts are stored as keys and values, respectively.
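As a minimal sketch of the dictionary approach described above, the loop below counts the letters in a word by hand. The word 'renaissance' is the example used later in this tutorial; the variable names are only illustrative.

```python
word = "renaissance"

# Build the counts by hand with a plain dictionary:
# keys are the letters, values are how often each letter appears.
letter_count = {}
for letter in word:
    # dict.get() returns 0 the first time a letter is seen
    letter_count[letter] = letter_count.get(letter, 0) + 1

print(letter_count)
# {'r': 1, 'e': 2, 'n': 2, 'a': 2, 'i': 1, 's': 2, 'c': 1}
```

This works, but the counting logic has to be written out every time, which is the boilerplate that Counter removes.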
As the Counter class is part of Python’s built-in collections module, you can import it in your Python script like so:
from collections import Counter
After importing the Counter class as mentioned, you can instantiate a counter object as shown:
<counter_object> = Counter(iterable)
Here, iterable is any valid Python iterable, such as a list, string, or tuple, whose items are hashable.
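For instance, a counter for the letters of a word might be built as in the sketch below. The names word and letter_count are illustrative, chosen to match the example word used in this tutorial.

```python
from collections import Counter

word = "renaissance"

# Passing the string to Counter counts every character in one step
letter_count = Counter(word)
print(letter_count)

# A Counter behaves like a dictionary: index it by item to read a count
print(letter_count["e"])  # 2

# Unlike a plain dict, a missing key returns 0 instead of raising KeyError
print(letter_count["z"])  # 0
```

Looking up a missing key does not add it to the counter, so 'z' is still absent after the last line.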
In the output, we see that the counter object has been updated to also include the letters and their number of occurrences from another_word.
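The update step described here can be sketched as follows. This assumes letter_count was built from 'renaissance' and that another_word holds 'effervescence', the second word this tutorial works with.

```python
from collections import Counter

letter_count = Counter("renaissance")
another_word = "effervescence"

# update() adds the counts from another iterable (or another Counter)
# to the existing counts, in place
letter_count.update(another_word)
print(letter_count)

# 'e' appears twice in 'renaissance' and five times in 'effervescence'
print(letter_count["e"])  # 7
# 'f' appears only in 'effervescence'
print(letter_count["f"])  # 2
```

Note that update() adds to existing counts rather than replacing them, which differs from dict.update().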
Subtracting Elements from Another Iterable
Now let’s subtract the counts of the letters in another_word from the letter_count object. To do so, we can use the subtract() method. Calling <counter-object>.subtract(<some-iterable>) subtracts the counts of the items in <some-iterable> from <counter-object>, in place.
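A minimal sketch of this step, assuming letter_count has already been updated with the letters of another_word ('effervescence') as in the previous section:

```python
from collections import Counter

letter_count = Counter("renaissance")
another_word = "effervescence"
letter_count.update(another_word)

# subtract() removes the counts of another iterable (or Counter), in place
letter_count.subtract(another_word)
print(letter_count)

# Letters that came only from another_word stay as keys with a count of 0
print(letter_count["f"])  # 0

# The unary + operator returns a new Counter with only positive counts
print(+letter_count)
```

Because subtract() keeps zero and negative counts, the unary + is a convenient way to drop those entries when only positive counts are of interest.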
Notice how you get the keys and the number of occurrences common to the two words. Both ‘renaissance’ and ‘effervescence’ contain two occurrences of ‘e’, and one occurrence each of ‘r’, ‘n’, ‘s’, and ‘c’ in common.
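The intersection described above can be reproduced with the & operator, as in this sketch. The variable names letters_1, letters_2, and common are illustrative and not taken from the original example.

```python
from collections import Counter

letters_1 = Counter("renaissance")
letters_2 = Counter("effervescence")

# & keeps only the keys present in both counters,
# with the smaller of the two counts for each key
common = letters_1 & letters_2
print(common)

print(common["e"])  # 2: min(2, 5)
print(common["c"])  # 1: min(1, 2)
```

The & operator returns a new Counter and leaves both operands unchanged.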
Find the Most Frequent Items Using most_common
Another common operation on the Python counter object is to find the most frequently occurring items.
To get the top k most common items in the counter, you can use the most_common() method on the counter object. Here, we call most_common() on letter_count to find the three most frequently occurring letters.
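We see that the letters ‘e’, ‘n’, and ‘a’ each occur twice in the word ‘renaissance’. The letter ‘s’ also occurs twice, but only three items were requested, and items with equal counts are returned in the order they were first encountered.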
This is especially helpful if the counter contains a large number of entries and you’re interested in working with the most common keys.
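As a short sketch of the call described in this section, assuming letter_count is built from 'renaissance' (the name top_three is illustrative):

```python
from collections import Counter

letter_count = Counter("renaissance")

# most_common(k) returns a list of (item, count) tuples,
# sorted from the most to the least frequent
top_three = letter_count.most_common(3)
print(top_three)
# [('e', 2), ('n', 2), ('a', 2)]

# Calling most_common() without an argument returns every item
print(letter_count.most_common())
```

Items with equal counts keep the order in which they were first encountered, which is why 's' (also with a count of 2) does not make the top three here.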
Here’s a quick review of what we’ve learned in this tutorial:
The Counter class from Python’s built-in collections module can be used to get a dictionary of count values of all items in any iterable. You should make sure that all the items in the iterable are hashable.
You can update the contents of one Python counter object with contents from another counter object or any other iterable using the update() method with the syntax: counter1.update(counter2). Note that you can use any iterable in place of counter2.
If you want to remove the contents of one of the iterables from the updated counter, you can use the subtract() method: counter1.subtract(counter2).
To find the common elements between two counter objects, you can use the & operator. Given two counters counter1 and counter2, counter1 & counter2 returns the intersection of these two counter objects.
To get the k most frequent items in a counter, you can use the most_common() method. counter.most_common(k) gives the k most common items and the respective counts.
Next, learn how to use defaultdict, another class in the collections module. You can use defaultdict instead of a regular Python dictionary to handle missing keys.
Bala Priya C
Bala Priya is a developer and technical writer from India with over three years of experience in the technical content writing space. She shares her learning with the developer community by authoring tech tutorials, how-to guides, and more…