Geekflare is supported by our audience. We may earn affiliate commissions from buying links on this site.
In Development Last updated: June 28, 2023
Share on:
Invicti Web Application Security Scanner – the only solution that delivers automatic verification of vulnerabilities with Proof-Based Scanning™.

Are you looking to add data structures to your programming toolbox? Take the first steps today by learning about data structures in Python.

When you’re learning a new programming language, it’s important to understand the basic data types and the built-in data structures that the language supports. In this guide on data structures in Python, we’ll cover the following:

  • advantages of data structures 
  • built-in data structures in Python, such as lists, tuples, dictionaries, and sets
  • implementations of abstract data types such as stacks and queues.

Let’s begin!

Why Are Data Structures Helpful?

why

Before we go over the various data structures, let’s see how using data structures can be helpful:

  • Efficient data processing: Choosing the right data structure helps to process data more effectively. For example, if you need to store a collection of items of the same data type—with constant look-up times and tight coupling—you can choose an array.
  • Better memory management: In larger projects, for storing the same data, one data structure might be more memory efficient than another. For instance, in Python, both lists and tuples can be used to store collections of data of the same or different data types. However, if you know that you don’t have to modify the collection, you can choose a tuple that takes up relatively less memory than a list.
  • More organized code: Using the right data structure for a particular functionality makes your code more organized. Other developers who read your code will expect you to use specific data structures depending on the desired behavior. For example: if you need a key-value mapping with constant look-up and insertion times, you can store the data in a dictionary.

Lists

When we need to create dynamic arrays in Python—from coding interviews to common use cases—lists are the go-to data structures.

Python lists are container data types that are mutable and dynamic, so you can add and remove elements from a list in place—without having to create a copy.

When using Python lists:

  • Indexing into the list and accessing an element at a specific index is a constant time operation.
  • Adding an element to the end of the list is a constant time operation.
  •  Inserting an element at a specific index is a linear time operation.

There are a set of list methods that help us perform common tasks efficiently. The code snippet below shows how to perform these operations on an example list:

>>> nums = [5,4,3,2]

>>> nums.append(7)
>>> nums
[5, 4, 3, 2, 7]

>>> nums.pop()
7
>>> nums
[5, 4, 3, 2]

>>> nums.insert(0,9)
>>> nums
[9, 5, 4, 3, 2]

Python lists also support slicing and membership testing using the in Operator:

>>> nums[1:4]
[5, 4, 3]

>>> 3 in nums
True

The list data structure is not only flexible and simple but also allows us to store elements of different data types. Python also has a dedicated array data structure for efficient storage elements of the same data type. We’ll learn about this later in this guide.

Tuples

In Python, tuples are another popular built-in data structure. They are like Python lists in that you can index them in constant time and slice them. But they are immutable, so you cannot modify them in place. The following code snippet explains the above with an example nums tuple:

>>> nums = (5,4,3,2)

>>> nums[0]
5

>>> nums[0:2]
(5, 4)

>>> 5 in nums
True

>>> nums[0] = 7 # not a valid operation!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

So when you want to create an immutable collection and be able to process it efficiently, you should consider using a tuple. If you’d like the collection to be mutable, prefer using a list instead.

📋 Learn more about the similarities and differences between Python lists and tuples.

Arrays

Arrays are lesser-known data structures in Python. They are similar to Python lists in terms of the operations they support, such as indexing in constant time and inserting an element at a specific index in linear time.

However, the key difference between lists and arrays is that arrays store elements of a single data type. Therefore, they are tightly coupled and more memory efficient.

To create an array, we can use the array() constructor from the built-in array module. The array() constructor takes in a string specifying the data type of the elements and the elements. Here we create nums_f, an array of floating point numbers:

>>> from array import array
>>> nums_f = array('f',[1.5,4.5,7.5,2.5])
>>> nums_f
array('f', [1.5, 4.5, 7.5, 2.5])

You can index into an array (similar to Python lists):

>>> nums_f[0]
1.5

Arrays are mutable, so you can modify them:

>>> nums_f[0]=3.5
>>> nums_f
array('f', [3.5, 4.5, 7.5, 2.5])

But you cannot modify an element to be of a different data type:

>>> nums_f[0]='zero'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be real number, not str

Strings

In Python strings are immutable collections of Unicode characters. Unlike programming languages like C, Python does not have a dedicated character data type. So a character is also a string of length one.

As mentioned the string is immutable:

>>> str_1 = 'python'
>>> str_1[0] = 'c'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

Python strings support string slicing and a set of methods to format them. Here are some examples:

>>> str_1[1:4]
'yth'
>>> str_1.title()
'Python'
>>> str_1.upper()
'PYTHON'
>>> str_1.swapcase()
'PYTHON'

⚠ Remember, all the above operations return a copy of the string and do not modify the original string. If you’re interested, check out the guide on Python Programs on String Operations.

Sets

In Python, sets are collections of unique and hashable items. You can perform the common set operations such as union, intersection, and difference:

>>> set_1 = {3,4,5,7}
>>> set_2 = {4,6,7}

>>> set_1.union(set_2)
{3, 4, 5, 6, 7}

>>> set_1.intersection(set_2)
{4, 7}

>>> set_1.difference(set_2)
{3, 5}

Sets are mutable by default, so you can add new elements and modify them:

>>> set_1.add(10)
>>> set_1
{3, 4, 5, 7, 10}

📚 Read Sets in Python: A Complete Guide with Code Examples

FrozenSets

If you want an immutable set, you can use a frozen set. You can create a frozen set from existing sets or other iterables.

>>> frozenset_1 = frozenset(set_1)
>>> frozenset_1
frozenset({3, 4, 5, 7, 10, 11})

Because frozenset_1 is a frozen set, we run into errors if we try to add elements (or otherwise modify it):

>>> frozenset_1.add(15)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'add'

Dictionaries

A Python dictionary is functionally similar to a hash map. Dictionaries are used to store key-value pairs. The keys of the dictionary should be hashable. Meaning the hash value of the object does not change.

You can access the values using keys, insert new items, and remove existing items in constant time. There are dictionary methods to perform these operations.

>>> favorites = {'book':'Orlando'}
>>> favorites
{'book': 'Orlando'}

>>> favorites['author']='Virginia Woolf'
>>> favorites
{'book': 'Orlando', 'author': 'Virginia Woolf'}

>>> favorites.pop('author')
'Virginia Woolf'
>>> favorites
{'book': 'Orlando'}

OrderedDict

Though a Python dictionary provides key-value mapping, it is inherently an unordered data structure. Since Python 3.7, the order of insertion of elements is preserved. But you can make this more explicit by using OrderedDict from the collections module.

As shown, an OrderedDict preserves the order of the keys:

>>> from collections import OrderedDict
>>> od = OrderedDict()
>>> od['first']='one'
>>> od['second']='two'
>>> od['third']='three'
>>> od
OrderedDict([('first', 'one'), ('second', 'two'), ('third', 'three')])
>>> od.keys()
odict_keys(['first', 'second', 'third'])

Defaultdict

Key errors are quite common when working with Python dictionaries. Whenever you try to access a key which has not been added to the dictionary, you will run into a KeyError exception.

But using defaultdict from collections module, you can handle this case natively. When we try to access a key with is not present in the dictionary, the key is added and initialized with the default values specified by the default factory.

>>> from collections import defaultdict
>>> prices = defaultdict(int)
>>> prices['carrots']
0

Stacks

Stack is a last-in-first-out (LIFO) data structure. We can perform the following operations on a stack:

  • Add elements to the top of the stack: push operation 
  • Remove elements from the top of the stack: pop operation 

An example to illustrate how stack push and pop operations work:

Stacks-1
Stacks

How to Implement a Stack Using a List

In Python, we can implement the stack data structure using a Python list.

Operation on StackEquivalent List Operation
Push to stack topAppend to the end of the list using the append() method
Pop off the stack topRemove and return the last element using the pop() method

The code snippet below shows how we can emulate the behavior of a stack using a Python list:

>>> l_stk = []
>>> l_stk.append(4)
>>> l_stk.append(3)
>>> l_stk.append(7)
>>> l_stk.append(2)
>>> l_stk.append(9)
>>> l_stk
[4, 3, 7, 2, 9]
>>> l_stk.pop()
9

How to Implement a Stack Using a Deque

Another method to implement a stack is using deque from the collections module. Deque stands for double-ended queue and supports the addition and removal of elements from both ends.

To emulate the stack, we can:

  • append to the end of the deque using append(), and 
  • pop off the last added element using pop().
>>> from collections import deque
>>> stk = deque()
>>> stk.append(4)
>>> stk.append(3)
>>> stk.append(7)
>>> stk.append(2)
>>> stk.append(9)
>>> stk
deque([4, 3, 7, 2,9])
>>> stk.pop()
9

Queues

Queue is a first-in-first-out (FIFO) data structure. The elements are added to the end of the queue and are removed from the beginning of the queue (head end of the queue) as shown:

queue-1
q

We can implement the queue data structure using a deque:

  • add elements to the end of the queue using append()
  • use the popleft() method to remove element from the beginning of the queue
>>> from collections import deque
>>> q = deque()
>>> q.append(4)
>>> q.append(3)
>>> q.append(7)
>>> q.append(2)
>>> q.append(9)
>>> q.popleft()
4

Heaps

In this section, we’ll discuss binary heaps. We’ll focus on min heaps.

A min heap is a complete binary tree. Let’s breakdown what a complete binary tree means:

  • A binary tree is a tree data structure where each node has at most two child nodes such that each node is less than its child.
  • The term complete means the tree is completely filled, except for, perhaps, the last level. If the last level is partially filled, it is filled from left to right.

Because every node has at most two child nodes. And also satisfies the property that it is less than its child, the root is the minimum element in a min heap.

Here’s an example min heap:

heaps

In Python, the heapq module helps us construct heaps and perform operations on the heap. Let’s import the required functions from heapq:

>>> from heapq import heapify, heappush, heappop

If you have a list or another iterable, you can construct a heap from it by calling heapify():

>>> nums = [11,8,12,3,7,9,10]
>>> heapify(nums)
Heaps1

You can index the first element to check that it is the minimum element:

>>> nums[0]
3

Now if you insert an element to the heap, the nodes will be rearranged such that they satisfy the min heap property.

>>> heappush(nums,1)
python-heap

As we inserted 1 (1 < 3), we see that nums[0] returns 1 which is now the minimum element (and the root node).

>>> nums[0]
1

You can remove elements from the min heap by calling the heappop() function as shown:

>>> while nums:
...     print(heappop(nums))
...
# Output
1
3
7
8
9
10
11
12

Max Heaps in Python

Now that you know about min heaps, can you guess how we can implement a max heap?

Well, we can convert a min heap implementation to a max heap by multiplying each number by -1. Negated numbers arranged in a min heap is equivalent to the original numbers arranged in a max heap.

In the Python implementation, we can multiply the elements by -1 when adding an element to the heap using heappush():

>>> maxHeap = []
>>> heappush(maxHeap,-2)
>>> heappush(maxHeap,-5)
>>> heappush(maxHeap,-7)

The root node—multiplied by -1—will be the maximum element. 

>>> -1*maxHeap[0]
7

When removing the elements from the heap, use heappop() and multiply by -1 to get back the original value:

>>> while maxHeap:
...     print(-1*heappop(maxHeap))
...
# Output
7
5
2

Priority Queues

Let’s wrap up the discussion by learning about the priority queue data structure in Python.

We know: In a queue the elements are removed in the same order in which they enter the queue. But a priority queue serves elements by priority—very useful for applications like scheduling. So at any point of time the element with the highest priority is returned.

We can use keys to define the priority. Here we’ll use numeric weights for the keys.

How to Implement Priority Queues Using Heapq

Here’s the priority queue implementation using heapq and Python list:

>>> from heapq import heappush,heappop
>>> pq = []
>>> heappush(pq,(2,'write'))
>>> heappush(pq,(1,'read'))
>>> heappush(pq,(3,'code'))
>>> while pq:
...     print(heappop(pq))
...

When removing elements, the queue serves the highest priority element (1,'read') first, followed by (2,'write') and then (3,'code').

# Output
(1, 'read')
(2, 'write')
(3, 'code')

How to Implement Priority Queues Using PriorityQueue

To implement a priority queue, we can also use the PriorityQueue class from the queue module. This also uses heap internally.

Here’s the equivalent implementation of the priority queue using PriorityQueue:

>>> from queue import PriorityQueue
>>> pq = PriorityQueue()
>>> pq.put((2,'write'))
>>> pq.put((1,'read'))
>>> pq.put((3,'code'))
>>> pq
<queue.PriorityQueue object at 0x00BDE730>
>>> while not pq.empty():
...     print(pq.get())
...
# Output
(1, 'read')
(2, 'write')
(3, 'code')

Summing Up

In this tutorial, you learned about the various built-in data structures in Python. We also went over the different operations supported by these data structures—and the built-in methods to do the same. 

Then, we went over other data structures such as stacks, queues, and priority queues—and their Python implementation using functionality from the collections module.

Next, check out the list of beginner-friendly Python projects.

  • Bala Priya C
    Author
    Bala Priya is a developer and technical writer from India with over three years of experience in the technical content writing space. She shares her learning with the developer community by authoring tech tutorials, how-to guides, and more…. read more
Thanks to our Sponsors
More great readings on Development
Power Your Business
Some of the tools and services to help your business grow.
  • Invicti uses the Proof-Based Scanning™ to automatically verify the identified vulnerabilities and generate actionable results within just hours.
    Try Invicti
  • Web scraping, residential proxy, proxy manager, web unlocker, search engine crawler, and all you need to collect web data.
    Try Brightdata
  • Monday.com is an all-in-one work OS to help you manage projects, tasks, work, sales, CRM, operations, workflows, and more.
    Try Monday
  • Intruder is an online vulnerability scanner that finds cyber security weaknesses in your infrastructure, to avoid costly data breaches.
    Try Intruder