In this tutorial, you will learn how to remove duplicate items from Python lists.

When you are working with lists in Python, you may sometimes need only the unique items in a list, with the duplicates removed.

There are a few different ways you can do this. In this tutorial, we’ll go over five such techniques.

Basics of Python Lists

Let’s start our discussion by reviewing the basics of Python lists.

Python lists are mutable, so you can modify them in place by adding and removing elements. In addition, the elements of a Python list do not have to be unique: the same value can appear more than once.

So how do you retain only the unique elements and remove the duplicate or repeating elements?

Well, you can do this in a few different ways. You can either create a new list that contains only the unique elements of the original list, or modify the original list in place and remove the duplicate items.

We will learn these in detail in this tutorial.

Methods to Remove Duplicates from Python Lists

Let’s take a real-world example. Suppose you’re at your friend’s birthday party.🎊🎉

In the collection of sweets displayed, you see there are some items that are repeated. You’d now like to remove those duplicate items from the list of sweets.

[Image: remove-duplicate-from-list]

Let’s create a sweets list containing all the items in the image above.

sweets = ["cupcake","candy","lollipop","cake","lollipop","cheesecake","candy","cupcake"]

In the above sweets list, the items ‘cupcake’, ‘candy’, and ‘lollipop’ each appear twice. Let’s use this example list to remove the duplicate items.

Iterate over Python Lists to Remove Duplicates

The most straightforward method is to create a new list that contains each item exactly once. 

Read through the code cell below:

unique_sweets = []
for sweet in sweets:
  if sweet not in unique_sweets:
    unique_sweets.append(sweet)

print(unique_sweets)

# Output
['cupcake', 'candy', 'lollipop', 'cake', 'cheesecake']
  • We initialize an empty list unique_sweets.
  • While looping through the sweets list, we access each sweet.
  • If sweet is not already present in the unique_sweets list, we add it to the end of unique_sweets list using the .append() method.

Suppose you come across a repeating item, for example, the second occurrence of ‘candy’ in the sweets list. It is not added to the unique_sweets list because it’s already present: sweet not in unique_sweets evaluates to False for the second occurrence of ‘cupcake’, ‘candy’, and ‘lollipop’.

Therefore, in this method, every item occurs exactly once in the unique_sweets list—without any repetition.
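For longer lists, the check sweet not in unique_sweets rescans the new list on every iteration. The sketch below is one possible variation of the same loop that tracks the items seen so far in a set. The function name remove_duplicates and the variable seen are illustrative and not part of the original example, and the approach assumes every item is hashable.

```python
def remove_duplicates(items):
    """Return a new list with each item kept once, in order of first appearance.

    Hypothetical helper for illustration; assumes all items are hashable.
    """
    seen = set()        # items encountered so far (fast membership checks)
    unique_items = []   # result list, built in the original order
    for item in items:
        if item not in seen:
            seen.add(item)
            unique_items.append(item)
    return unique_items


sweets = ["cupcake", "candy", "lollipop", "cake", "lollipop", "cheesecake", "candy", "cupcake"]
unique_sweets = remove_duplicates(sweets)
print(unique_sweets)
# Output
# ['cupcake', 'candy', 'lollipop', 'cake', 'cheesecake']
```

The original sweets list is left unchanged; only the returned list is free of repetitions.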

Use List Comprehension to Remove Duplicates

You can also use list comprehension to populate the unique_sweets list.

Want to refresh the basics of list comprehension?

▶️ Check out the tutorial on list comprehension in Python.

Let’s use the list comprehension expression: [output for item in iterable if condition is True] to rewrite the above looping concisely.

unique_sweets = []
[unique_sweets.append(sweet) for sweet in sweets if sweet not in unique_sweets]
print(unique_sweets)

# Output
['cupcake', 'candy', 'lollipop', 'cake', 'cheesecake']

Note that the list comprehension itself does not produce the list of unique items. Because .append() returns None, the comprehension builds a temporary list of None values that is then discarded, and unique_sweets is populated as a side effect of calling .append().
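A small sketch that makes this visible by capturing the comprehension's own result. The variable name result is introduced only for illustration.

```python
sweets = ["cupcake", "candy", "lollipop", "cake", "lollipop", "cheesecake", "candy", "cupcake"]

unique_sweets = []
# Capture the list that the comprehension itself produces.
# .append() returns None, so that is what each element will be.
result = [unique_sweets.append(sweet) for sweet in sweets if sweet not in unique_sweets]

print(result)
# Output
# [None, None, None, None, None]

print(unique_sweets)
# Output
# ['cupcake', 'candy', 'lollipop', 'cake', 'cheesecake']
```

If you only need unique_sweets, the explicit for loop from the previous section does the same work without building the extra list.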

To remove duplicate items from Python lists, you can also use built-in list methods, and we’ll cover this in the next section.

Use Built-in List Methods to Remove Duplicates

You can use the Python list methods .count() and .remove() to remove duplicate items.

  • With the syntax list.count(value), the .count() method returns the number of times value occurs in list. So the count corresponding to repeating items will be greater than 1.
  • list.remove(value) removes the first occurrence of value from the list.

Using the above, we have the following code.

for sweet in sweets:
  # check if the count of sweet is > 1 (repeating item)
  if sweets.count(sweet) > 1:
    # if True, remove the first occurrence of sweet
    sweets.remove(sweet)

print(sweets)

# Output
['cake', 'lollipop', 'cheesecake', 'candy', 'cupcake']

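Each time the loop finds an item whose count is greater than 1, it removes only one occurrence, because the .remove() method deletes just the first match. The loop also removes items from sweets while iterating over it, so the remaining items shift to the left and some of them are never visited. As a result, this loop works for the list above but is not guaranteed to remove every repetition.

  • If a particular item is duplicated (occurs exactly twice) and the loop visits it, this method removes the first occurrence.
  • If a particular item is repeated K times, one occurrence is removed per visit, so repetitions can still remain if the loop visits that item fewer than K-1 times.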
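As a hedged illustration of how repetitions can survive, the sketch below runs the same if-based loop on a hypothetical list in which ‘candy’ occurs four times. The list contents are made up for this example.

```python
# Hypothetical list: 'candy' occurs four times
sweets = ["cake", "candy", "candy", "candy", "candy"]

for sweet in sweets:
    # remove one occurrence each time a repeated item is visited
    if sweets.count(sweet) > 1:
        sweets.remove(sweet)

print(sweets)
# Output
# ['cake', 'candy', 'candy']
```

Here the list shrinks while the loop is running, so the loop reaches the end of the list before every repetition of ‘candy’ has been removed.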

But in general, when we say duplicates, we usually refer to all repetitions.

To handle this case, you could modify the above loop to remove all repetitions except one. Instead of using an if conditional to check the count of a particular item, you could run a while loop to repeatedly remove duplications until the count of every item in the list is 1.

Let’s redefine the list sweets so that ‘candy’ occurs three times, while ‘cupcake’ and ‘lollipop’ each occur twice.

sweets = ["cupcake","candy","lollipop","cake","lollipop","candy","cheesecake","candy","cupcake"]

You can use a while loop to remove repetitions, as shown below. The outer for loop iterates over a copy of the list (sweets.copy()), so removing items from sweets does not cause any item to be skipped. The while loop keeps running so long as the count of sweet in sweets is greater than 1. When only one occurrence remains, the condition sweets.count(sweet) > 1 becomes False, and the outer loop moves on to the next item.

for sweet in sweets.copy():
  # keep removing the first occurrence of sweet until only one occurrence remains
  while sweets.count(sweet) > 1:
    sweets.remove(sweet)

print(sweets)
# Output
['cake', 'lollipop', 'cheesecake', 'candy', 'cupcake']

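But these nested loops are not efficient: both .count() and .remove() scan the list, so the running time grows roughly quadratically with the length of the list. If you’re working with large lists, consider using one of the other techniques discussed.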

So far, we have learned the following:

  • Methods to remove duplicate items from Python lists by creating new lists that contain only the unique items
  • Built-in list methods .count() and .remove() to modify the list in place

Some built-in Python data structures do not allow repeated entries: the elements of a set and the keys of a dictionary must all be unique. Therefore, we can convert a Python list to one of these data structures to remove duplicates, and then convert the result back to a list. We’ll learn how to do this in the upcoming sections.

Cast Python List into a Set to Remove Duplicates

Python sets are collections of elements that are all unique. Therefore, the number of items present in the set (given by len(<set-obj>)) is equal to the number of unique elements present.

You can cast any Python iterable into a set using the syntax: set(iterable).

Now, let’s cast the list sweets into a set and examine the output.

set(sweets)
# Output
{'cake', 'candy', 'cheesecake', 'cupcake', 'lollipop'}

From the output in the above code cell, we see that every item appears exactly once, and the duplicates have been removed.

Also, notice that the order of items is not necessarily the same as their order in the original list sweets. This is because, besides being a collection of unique elements, a Python set object is an unordered collection. The order you see when you run the code may therefore differ from the output shown here.

Now that we have removed the duplicates by casting the list into a set, we can again convert it into a list, as shown below.

unique_sweets = list(set(sweets))
print(unique_sweets)

# Output
['cake', 'cheesecake', 'candy', 'cupcake', 'lollipop']
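If you need the unique items back in their original order after casting to a set, one option is to sort the set by each item's first position in the original list. This is only a sketch and not part of the steps above. It relies on the built-in sorted() function and the list method .index(), which returns the index of the first occurrence of a value.

```python
sweets = ["cupcake", "candy", "lollipop", "cake", "lollipop", "cheesecake", "candy", "cupcake"]

# Sort the unique items by where they first appear in the original list
unique_sweets = sorted(set(sweets), key=sweets.index)
print(unique_sweets)
# Output
# ['cupcake', 'candy', 'lollipop', 'cake', 'cheesecake']
```

Keep in mind that .index() scans the list for every item, so this can be slow for large lists.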

Use List Items as Dictionary Keys to Remove Duplicates

A Python dictionary is a collection of key-value pairs where the keys uniquely identify the values.

You can create a Python dictionary using the .fromkeys() method with the syntax: dict.fromkeys(keys, value). Here, keys is an iterable containing the keys of the dictionary, and value is a single value that is assigned to every key.

  • keys is a required parameter, and it can be any Python iterable whose items become the keys of the dictionary.
  • value is an optional parameter. If you don’t specify it, the default value of None is used for every key.
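The short sketch below shows how the second argument of dict.fromkeys() behaves: it is a single value assigned to every key, not a sequence of per-key values. The list flavors and the value 0 are made up for illustration.

```python
flavors = ["vanilla", "chocolate", "vanilla"]

# No second argument: every key maps to None
print(dict.fromkeys(flavors))
# Output
# {'vanilla': None, 'chocolate': None}

# With a second argument: the same value is assigned to every key
print(dict.fromkeys(flavors, 0))
# Output
# {'vanilla': 0, 'chocolate': 0}
```

In both cases the repeated key ‘vanilla’ appears only once in the resulting dictionary.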

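If you pass only the list, dict.fromkeys(sweets) returns a Python dictionary whose keys are the unique items of sweets and whose values are all set to None, the default value. The code cell below shows this, assuming sweets still holds the original list with the repeated items.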

dict.fromkeys(sweets)

# Output
{'cupcake': None,
 'candy': None,
 'lollipop': None,
 'cake': None,
 'cheesecake': None}

As with the previous section, we can again convert the dictionary into a list, as shown below.

unique_sweets = list(dict.fromkeys(sweets))
print(unique_sweets)
# Output
['cupcake', 'candy', 'lollipop', 'cake', 'cheesecake']

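From the output above, we can see that the duplicate items have been removed from the list sweets. Because dictionaries preserve insertion order (guaranteed since Python 3.7), the unique items also keep the order in which they first appear in the original list.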
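If you would rather update the original list in place than create unique_sweets, one option is slice assignment. This is a sketch that goes slightly beyond the steps above. The variable same_list is introduced only to show that the list object itself is modified rather than replaced.

```python
sweets = ["cupcake", "candy", "lollipop", "cake", "lollipop", "cheesecake", "candy", "cupcake"]
same_list = sweets  # second reference to the same list object

# Replace the contents of the existing list with its unique items.
# Iterating over a dictionary yields its keys.
sweets[:] = dict.fromkeys(sweets)

print(sweets)
# Output
# ['cupcake', 'candy', 'lollipop', 'cake', 'cheesecake']

print(same_list is sweets)
# Output
# True
```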

Summing Up👩‍🏫

Here’s a recap of the different methods you can use to remove duplicate items or repetitions from Python lists.

  • Use the Python list method .append() to add non-repeating items to a new list. The new list contains each item in the original list exactly once, without any repetitions. You can also do this using list comprehension.
  • Use the built-in .count() and .remove() methods to remove one occurrence of a repeated item at a time. Placing .remove() in a while loop removes all additional occurrences.
  • Cast a Python list into a set to retain only the unique elements. The original order of the items is not preserved.
  • Use dict.fromkeys(list) to remove any duplicates from the list, because the keys of a dictionary must be unique.

Next, check out Python projects to practice and learn. Or learn how to find the index of an item in Python lists. Happy learning!