In this tutorial, you’ll learn how to use the Counter class from Python’s collections module.

When you’re working with long sequences in Python, say, Python lists or strings, you may sometimes need to store the items that appear in the sequence and the number of times they appear.

A Python dictionary is a suitable built-in data structure for such applications. However, the Counter class from Python’s collections module simplifies this by constructing a counter, which is a dictionary of the items and their counts in the sequence.

Over the next few minutes, you’ll learn how to:

  • Use Python’s counter object 
  • Create a Python dictionary to store count values of items in an iterable 
  • Rewrite the dictionary using Python’s counter with a simplified syntax
  • Perform operations such as updating and subtracting elements, and finding the intersection between two counter objects
  • Get the most frequent items in the counter using the most_common() method

Let’s get started!

Python Collections Module and Counter Class

You’ll often use a Python dictionary to store the items in an iterable and their counts. The items and the counts are stored as keys and values, respectively.

As the Counter class is part of Python’s built-in collections module, you can import it in your Python script like so:

from collections import Counter

After importing the Counter class as mentioned, you can instantiate a counter object as shown:

<counter_object> = Counter(iterable)

Here:

  • iterable is any valid Python iterable such as a Python list, string, or tuple.
  • The items in the iterable should be hashable.
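As a rough sketch of both points, the snippet below builds counters from a list and a tuple, then shows what happens when the items are not hashable. The sample data and variable names are made up for illustration.

```python
from collections import Counter

# Counting works the same way for lists, tuples, and strings.
colors = ["red", "blue", "red", "green", "blue", "red"]  # hypothetical sample data
color_count = Counter(colors)
print(color_count)  # Counter({'red': 3, 'blue': 2, 'green': 1})

number_count = Counter((1, 2, 2, 3, 3, 3))
print(number_count)  # Counter({3: 3, 2: 2, 1: 1})

# Lists are mutable and therefore unhashable, so a list of lists fails.
try:
    Counter([[1, 2], [1, 2]])
except TypeError as err:
    unhashable_error = str(err)
    print("TypeError:", unhashable_error)

# Converting the inner lists to tuples makes the items hashable.
pair_count = Counter(tuple(pair) for pair in [[1, 2], [1, 2], [3, 4]])
print(pair_count)  # Counter({(1, 2): 2, (3, 4): 1})
```

If your data contains unhashable items, converting them to a hashable equivalent (for example, lists to tuples) is one common workaround.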

Now that we know how to use Counter to create counter objects from any Python iterable, let’s start coding.

The examples used in this tutorial can be found in this GitHub gist.

How to Create a Counter Object from Python Iterables

Let’s create a Python string, say, ‘renaissance’ and call it word.

>>> word = "renaissance"

Our goal is to create a dictionary where each letter in the word string is mapped to the number of times it occurs in the string. One approach is to use a for loop as shown:

>>> letter_count = {}
>>> for letter in word:
...     if letter not in letter_count:
...         letter_count[letter] = 0
...     letter_count[letter] += 1
...
>>> letter_count
{'r': 1, 'e': 2, 'n': 2, 'a': 2, 'i': 1, 's': 2, 'c': 1}

Let’s parse what the above code snippet does:

  • Initializes letter_count to an empty Python dictionary.
  • Loops through the word string.
  • Checks if letter is present in the letter_count dictionary.
  • If letter is not present, adds it with an initial value of 0.
  • For each occurrence of letter in word, increments the value corresponding to letter by 1.
  • This continues until we loop through the entire string.

We constructed the letter_count dictionary on our own, using a for loop to iterate over the string word.
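The same manual approach can be written a little more compactly with dict.get(), which supplies a default of 0 for letters not seen yet. This is only a sketch; the function name count_letters is hypothetical and not part of any library.

```python
def count_letters(text):
    """Return a plain dict mapping each character in text to its count."""
    counts = {}
    for char in text:
        # get() returns 0 when char has not been seen yet.
        counts[char] = counts.get(char, 0) + 1
    return counts

letter_count_manual = count_letters("renaissance")
print(letter_count_manual)
# {'r': 1, 'e': 2, 'n': 2, 'a': 2, 'i': 1, 's': 2, 'c': 1}
```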

Now let’s use the Counter class from the collections module. We only need to pass in the word string to Counter() to get letter_count without having to loop through iterables.

>>> from collections import Counter
>>> letter_count = Counter(word)
>>> letter_count
Counter({'e': 2, 'n': 2, 'a': 2, 's': 2, 'r': 1, 'i': 1, 'c': 1})

The counter object is also a Python dictionary. We can use the built-in isinstance() function to verify this:

>>> isinstance(letter_count, dict)
True

As seen, isinstance(letter_count, dict) returns True indicating that the counter object letter_count is an instance of the Python dict class.
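Because a counter is a dictionary, the usual dictionary operations work on it. One difference worth knowing is that looking up a missing key returns 0 instead of raising a KeyError. A small sketch:

```python
from collections import Counter

letter_count = Counter("renaissance")

# Regular dictionary methods are available.
print(letter_count["e"])            # 2
print(sorted(letter_count.keys()))  # ['a', 'c', 'e', 'i', 'n', 'r', 's']
print(sum(letter_count.values()))   # 11, the length of the word

# Unlike a plain dict, a missing key returns 0 instead of raising KeyError,
# and the lookup does not add the key to the counter.
print(letter_count["z"])    # 0
print("z" in letter_count)  # False
```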

Modifying the Counter Object

So far, we’ve learned to create counter objects from Python strings.

You can also modify counter objects by updating them with elements from another iterable or subtracting another iterable from them.

Updating a Counter with Elements from Another Iterable

Let’s initialize another string another_word:

>>> another_word = "effervescence"

Suppose we’d like to update the letter_count counter object with the items from the another_word string.

We can use the update() method on the counter object letter_count.

>>> letter_count.update(another_word)
>>> letter_count
Counter({'e': 7, 'n': 3, 's': 3, 'c': 3, 'r': 2, 'a': 2, 'f': 2, 'i': 1, 'v': 1})

In the output, we see that the counter object has been updated to also include the letters and their number of occurrences from another_word.
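Note that update() on a counter adds to the existing counts, whereas update() on a plain dictionary replaces values. The sketch below illustrates the difference and also passes a mapping and another counter to update(). The inventory data is made up for illustration.

```python
from collections import Counter

inventory = Counter(apples=3, pears=1)  # hypothetical sample data

# update() with an iterable counts each element and adds it to the totals.
inventory.update(["apples", "plums"])
print(inventory)  # Counter({'apples': 4, 'pears': 1, 'plums': 1})

# update() with a mapping (or another Counter) adds the mapped counts.
inventory.update({"pears": 2})
inventory.update(Counter(plums=5))
print(inventory)  # Counter({'plums': 6, 'apples': 4, 'pears': 3})

# By contrast, dict.update() replaces existing values.
plain = {"apples": 3}
plain.update({"apples": 1})
print(plain)  # {'apples': 1}
```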

Subtracting Elements from Another Iterable

Now let’s subtract the letters in another_word from the letter_count object. To do so, we can use the subtract() method. Using <counter-object>.subtract(<some-iterable>) subtracts the counts of the items in <some-iterable> from the corresponding values in <counter-object>.

Let’s subtract another_word from letter_count.

>>> letter_count.subtract(another_word)
>>> letter_count
Counter({'e': 2, 'n': 2, 'a': 2, 's': 2, 'r': 1, 'i': 1, 'c': 1, 'f': 0, 'v': 0})

We see that the values corresponding to the letters in another_word have been subtracted, but the added keys ‘f’ and ‘v’ are not removed. They now map to a value of 0.

Note: Here, we have passed in another_word, a Python string, to the subtract() method call. We can also pass in a Python counter object or another iterable.
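As a sketch of passing a counter to subtract(), the example below subtracts one word's counts directly from another's. Counts can drop to zero or below, and the unary + operator returns a new counter that keeps only the positive counts.

```python
from collections import Counter

letter_count = Counter("renaissance")

# subtract() also accepts a Counter; counts may become zero or negative.
letter_count.subtract(Counter("effervescence"))
print(letter_count["e"])  # -3 (2 in 'renaissance' minus 5 in 'effervescence')
print(letter_count["r"])  # 0
print(letter_count["f"])  # -2

# Unary plus builds a new counter containing only the positive counts.
positive_only = +letter_count
print(positive_only)  # Counter({'a': 2, 'n': 1, 'i': 1, 's': 1})
```

Using +counter is one way to drop the leftover zero-count keys, such as ‘f’ and ‘v’ in the example above.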

Intersection Between Two Counter Objects in Python

You may sometimes want to find the intersection between two Python counter objects to identify which keys are common between the two.

Let’s create a counter object, say, letter_count_2, from the another_word string ‘effervescence’.

>>> another_word = "effervescence"
>>> letter_count_2 = Counter(another_word)
>>> letter_count_2
Counter({'e': 5, 'f': 2, 'c': 2, 'r': 1, 'v': 1, 's': 1, 'n': 1})

We can use the & operator to find the intersection between letter_count and letter_count_2.

>>> letter_count & letter_count_2
Counter({'e': 2, 'r': 1, 'n': 1, 's': 1, 'c': 1})

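Notice how you get the keys common to the two words, each mapped to the smaller of its two counts. For example, ‘e’ occurs twice in ‘renaissance’ and five times in ‘effervescence’, so the intersection maps ‘e’ to 2. The letters ‘r’, ‘n’, ‘s’, and ‘c’ each map to 1. Keys whose count is zero in either counter, such as ‘f’ and ‘v’, are left out.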
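To make the minimum rule concrete, here is a small sketch that rebuilds both counters from scratch. It also shows the | operator (union), which is not covered in this tutorial and keeps the larger of the two counts instead.

```python
from collections import Counter

first = Counter("renaissance")
second = Counter("effervescence")

common = first & second  # minimum of the two counts per key
either = first | second  # maximum of the two counts per key

print(common["e"], either["e"])  # 2 5
print(common["a"], either["a"])  # 0 2
print(sorted(common))            # ['c', 'e', 'n', 'r', 's']
```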

Find the Most Frequent Items Using most_common

Another common operation on the Python counter object is to find the most frequently occurring items.

To get the top k most common items in the counter, you can use the most_common() method on the counter object. Here, we call most_common() on letter_count to find the three most frequently occurring letters.

>>> letter_count.most_common(3)
[('e', 2), ('n', 2), ('a', 2)]

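We see that the letters ‘e’, ‘n’, and ‘a’ each occur twice in the word ‘renaissance’. The letter ‘s’ also occurs twice, but it is left out because we asked for only three items; items with equal counts are returned in the order they were first encountered.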

This is especially helpful if the counter contains a large number of entries and you’re interested in working with the most common keys.
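Two related patterns, sketched below: calling most_common() with no argument returns all items from most to least frequent, and slicing that list from the end gives the least frequent items.

```python
from collections import Counter

letter_count = Counter("renaissance")

# With no argument, most_common() returns every item, most frequent first.
all_items = letter_count.most_common()
print(all_items)
# [('e', 2), ('n', 2), ('a', 2), ('s', 2), ('r', 1), ('i', 1), ('c', 1)]

# Slicing the full list from the end gives the two least common items.
least_common = letter_count.most_common()[:-3:-1]
print(least_common)
# [('c', 1), ('i', 1)]
```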

Conclusion

Here’s a quick review of what we’ve learned in this tutorial:

  • The Counter class from Python’s built-in collections module can be used to get a dictionary of count values of all items in any iterable. You should make sure that all the items in the iterable are hashable.
  • You can update a Python counter object with the contents of another counter object or any other iterable using the update() method: counter1.update(counter2). Here, counter2 can be replaced by any iterable.
  • If you want to remove the contents of one of the iterables from the updated counter, you can use the subtract() method: counter1.subtract(counter2). Keys whose count drops to 0 remain in the counter.
  • To find the common elements between two counter objects, you can use the & operator. Given two counters counter1 and counter2, counter1 & counter2 returns the intersection of these two counter objects.
  • To get the k most frequent items in a counter, you can use the most_common() method. counter.most_common(k) gives the k most common items and the respective counts.

Next, learn how to use defaultdict, another class in the collections module. You can use defaultdict instead of a regular Python dictionary to handle missing keys.