Interested in learning how Python manages memory internally? If so, this guide to garbage collection in Python is for you.

When programming in Python, you don’t generally have to worry about memory allocation and deallocation. However, it’s helpful to understand how Python handles this under the hood through the process of garbage collection.

In this tutorial, we’ll explore garbage collection and its importance. We’ll then look at how Python uses reference counting to remove objects that are no longer in use, how it handles cyclic references, and how its generational approach to garbage collection works.

What Is Garbage Collection, And Why Is It Important?

In Python, you may not run into out-of-memory errors often, but it is possible.

When you create objects in a program, they take up memory. And there will likely be many objects that are no longer in use.

What happens if the memory occupied by such objects is never freed? Well, you’ll eventually run out of memory. Enter garbage collection.

Garbage collection is the process that is responsible for automatically identifying and reclaiming memory occupied by objects that are no longer in use—preventing memory leaks and improving the overall efficiency of memory usage.

How Garbage Collection Works in Python

Now that we understand garbage collection and its significance, let’s see how Python uses garbage collection.

Understanding Variables and Objects in Python

You’re probably used to thinking of variables as containers that hold values. In Python, however, it’s more accurate to describe variables as “names” or “labels” for objects rather than containers that hold objects. This idea is fundamental to understanding how variables work in Python.

Let’s take a closer look.

When you create a variable in Python, you’re essentially creating a reference to an object in memory. The variable is a label or name that points to a memory location where the object is stored. It does not store the object itself, but it “references” or “points to” the object.

Here’s an example:

a = 27

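When you run a = 27:

  • An object of type int is created in memory (or reused, if it already exists).
  • It has a value of 27.
  • The variable a points to or references that object, which adds one reference to the object’s reference count. In CPython, small integers such as 27 are cached and shared, so the count you observe for them will be higher than 1. (We’ll discuss reference counting in detail in the next section.)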

Note: In CPython, you can run hex(id(a)) to get the address in memory of the object that the label a points to.

What happens if you change the value of a, say with a = 7? The integer object 27 is not modified. Instead, the name a is rebound, so it no longer points to the previous memory address.

The variable a now points to an integer object with a value of 7.
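
The rebinding described above can be checked with id(). This is a minimal sketch, assuming CPython; the names first_id and second_id are introduced only for illustration:

```python
a = 27
first_id = id(a)    # identity of the int object 27

a = 7               # rebind the name; the object 27 itself is not modified
second_id = id(a)   # identity of the int object 7

print(first_id != second_id)  # True: a now refers to a different object
print(a)                      # 7
```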

Reference Counting

Reference counting is a memory management technique used in Python to keep track of the number of references to an object. 

The idea is to assign a reference count to each object and increment or decrement this count as references to the object are created or deleted. When the reference count drops to zero—indicating that there are no more references to the object—the memory occupied by the object can be reclaimed.

So what is the reference count? As discussed, each Python object has a reference count associated with it, which is the number of references pointing to it. These references can come from variables, but also from containers, attributes, and function arguments.

When we create a new reference to an object, the reference count is incremented. When a reference is removed (a variable goes out of scope, is deleted with del, or is rebound to another object such as None), the reference count is decremented.

And how does garbage collection occur? 

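  • When the reference count of an object drops to zero, it means there are no more references to that object.
  • The memory occupied by the object is then eligible for reclamation. In CPython, the object is deallocated as soon as its count reaches zero.
  • In addition, Python uses a garbage collector that periodically identifies and collects groups of objects that reference each other but are otherwise unreachable, and frees the associated memory.

Here’s an example that uses sys.getrefcount() to inspect the reference count:

import sys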

a = [1, 2, 3]
print(sys.getrefcount(a))  # Prints 2 because getrefcount itself creates a reference

b = a  # reference count increases by 1
print(sys.getrefcount(a))  # Prints 3

del b  # one reference removed
print(sys.getrefcount(a))  # Prints 2

You’ll get the following output:

# Output
2
3
2

How this works is pretty straightforward, but let’s go over the steps:

  • Initially, the reference count of the list [1, 2, 3] is 1.
  • Because there’s a temporary reference to a as the argument to getrefcount(), the count is 2 (one higher than expected).
  • When we assign a to b, the reference count becomes 3.
  • When we delete b, the reference count becomes 2 again.
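
The steps above track the count itself. The sketch below, which assumes CPython, shows what happens when the count finally reaches zero. It uses a weak reference as a probe, because a weak reference does not keep the object alive. The class Tracked and the variable names are hypothetical and exist only for this illustration:

```python
import weakref

class Tracked:
    """Hypothetical class used only for this illustration."""

obj = Tracked()
probe = weakref.ref(obj)   # a weak reference does not raise the reference count

alias = obj                # second strong reference
del obj                    # one strong reference (alias) remains
alive_with_alias = probe() is not None

del alias                  # last strong reference removed
alive_after_del = probe() is not None

print(alive_with_alias)  # True
print(alive_after_del)   # False in CPython: the object is freed right away
```

Calling probe() returns the object while it is alive and None once it has been reclaimed, so no manual call to the garbage collector is needed here.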

While reference counting is fundamental to Python’s memory management, it’s worth mentioning that Python also uses cyclic reference detection to handle more complex cases, such as circular references (more on this later).

Python’s Built-in gc Module

Let’s now learn about the gc module for garbage collection.

You can inspect the objects tracked by the garbage collector and explicitly trigger a collection in your Python script like so:

import gc

class MyClass:
    def __init__(self, name):
        self.name = name

# Create some objects
obj1 = MyClass('Object 1')
obj2 = MyClass('Object 2')
obj3 = MyClass('Object 3')

# Get all objects tracked by the garbage collector
all_objects = gc.get_objects()

# Print information about each object
for obj in all_objects:
    if isinstance(obj, MyClass):
        print(f'{obj.name}:{obj}')

# Manually trigger garbage collection
gc.collect()

Calling get_objects() from the gc module returns a list of all objects tracked by the garbage collector. Because this list can be very large, we print information only for instances of MyClass.

# Output
Object 1:<__main__.MyClass object at 0x7f96e7421510>
Object 2:<__main__.MyClass object at 0x7f96e74219d0>
Object 3:<__main__.MyClass object at 0x7f96e7421990>

Note: You can use gc.disable() to turn off automatic garbage collection. This affects only the cyclic garbage collector; reference counting keeps working. Disabling it is not recommended unless you need fine-grained control over when collections run.
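
As a rough sketch of how the note above might look in practice, the snippet below turns automatic collection off around a block of work and then restores it. The variable names are illustrative, and the placeholder comment stands in for whatever work you want to run without automatic collections:

```python
import gc

gc.disable()                      # turn off automatic cyclic collection
disabled_state = gc.isenabled()   # False

# ... allocation-heavy work would go here ...

unreachable = gc.collect()        # a manual collection still works while disabled
gc.enable()                       # turn automatic collection back on
enabled_state = gc.isenabled()    # True

print(disabled_state, enabled_state)  # False True
```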

Cyclic References

Cyclic (or circular) references occur when a group of objects references each other in a way that forms a cycle. 

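If no external references to the group exist, the objects are unreachable, yet reference counting alone can never reclaim them: each object is still referenced by another object in the cycle, so its reference count never reaches zero. Without additional cycle detection, this would create a memory leak.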

Here’s a simple example of a circular reference:

# Create an empty list
my_list = []

# Add circular reference
my_list.append(my_list)

Now my_list contains a reference to itself, creating a circular reference. In addition to reference counting, Python supports cyclic garbage collection to detect and reclaim such circular references.

The cyclic garbage collector also uses a generational approach.
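
To see the cyclic collector at work, here is a minimal sketch, assuming CPython. Two objects reference each other, and a weak reference is used as a probe to check whether they are still alive. The Node class and the variable names are hypothetical; automatic collection is disabled during the demo only so that the result is predictable:

```python
import gc
import weakref

class Node:
    """Hypothetical class for illustration; each node can point to a partner."""
    def __init__(self):
        self.partner = None

gc.disable()               # keep the automatic collector from running mid-demo

a = Node()
b = Node()
a.partner = b              # a -> b
b.partner = a              # b -> a, completing the cycle
probe = weakref.ref(a)     # weak reference: does not keep a alive

del a, b                   # no external references remain
alive_before = probe() is not None   # True: the counts never reached zero

gc.collect()               # the cyclic collector finds the unreachable cycle
alive_after = probe() is not None    # False: the cycle has been reclaimed

gc.enable()
print(alive_before, alive_after)  # True False
```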

Generational Approach to Garbage Collection

Generational garbage collection in Python is based on the observation that most objects in a program have a short lifespan. Objects are categorized into different generations based on how long they have been alive—how many collection sweeps they’ve survived—and the garbage collector applies different collection strategies to these generations.

Python’s generational garbage collection (as implemented in CPython) has traditionally divided objects into three generations:

#1. Generation 0

  • Newly created objects start in this generation.
  • Objects that survive a garbage collection cycle in this young generation move to the next older generation.

#2. Generation 1

  • Objects that survive a garbage collection cycle in generation 0 move to this middle generation.
  • Garbage collection is less frequent in this generation as compared to generation 0.

#3. Generation 2

  • Objects that survive a garbage collection cycle in generation 1 move to generation 2, the oldest generation.
  • Garbage collection in this generation is even less frequent.

Generational garbage collection is, therefore, based on the generational hypothesis, which states that young objects are more likely to become garbage than older objects.

By collecting the young generation more frequently and the older generations less frequently, the garbage collector can achieve better performance.
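
The gc module exposes some of this generational bookkeeping. The sketch below only inspects it; the exact numbers depend on your Python version and on what the program has allocated so far, so no specific values are shown:

```python
import gc

thresholds = gc.get_threshold()   # collection thresholds, one per generation
counts = gc.get_count()           # current collection counts

print(thresholds)
print(counts)

found = gc.collect(0)             # collect only the youngest generation
print(found)                      # number of unreachable objects found
```

Passing 0 to gc.collect() restricts the run to the youngest generation, while calling it with no argument performs a full collection.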

Advantages of Garbage Collection

We’ve already discussed why garbage collection is important. Let’s restate the advantages:

  • Garbage collection automates the process of reclaiming memory occupied by objects that are no longer in use, relieving developers from manual memory management.
  • It helps prevent memory leaks by identifying and cleaning up unreachable objects.
  • Garbage collection makes Python applications more robust, reducing the risk of crashes and unexpected behavior due to memory issues.

Best Practices for Garbage Collection in Python

Let’s list some best practices to leverage automatic garbage collection while also writing efficient Pythonic code:

Use Automatic Garbage Collection: Python has built-in automatic garbage collection, so avoid manually managing memory in most cases. Avoid disabling the garbage collector unless absolutely necessary.

Use Context Managers for Resources: When working with external resources like files or network connections, use context managers in with statements to ensure resources are properly closed and released. This reduces the chances of resource leaks.
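
As a small illustration of this practice, the sketch below writes to a scratch file inside a with statement and checks that the file is closed afterwards. The file name demo.txt and the variable names are hypothetical:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical scratch file used only for this illustration.
    path = os.path.join(tmp_dir, "demo.txt")

    with open(path, "w") as f:
        f.write("hello")
        open_inside = not f.closed   # True: the file is open inside the block

    closed_after = f.closed          # True: closed on exit from the with block

print(open_inside, closed_after)  # True True
```

The file is closed when the with block exits, even if an exception is raised inside it, so cleanup does not depend on when the file object happens to be garbage collected.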

Monitor and Profile Memory Usage: Use the gc and tracemalloc modules to monitor and profile memory usage in your Python applications to identify performance bottlenecks and potential memory leaks.
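
Here is one way this might look with tracemalloc. It is a minimal sketch; the list comprehension is a stand-in workload, and the reported numbers will vary from run to run:

```python
import tracemalloc

tracemalloc.start()                               # begin tracing allocations

data = [str(i) * 10 for i in range(10_000)]       # hypothetical workload

current, peak = tracemalloc.get_traced_memory()   # bytes traced now and at peak
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics("lineno")         # group allocations by source line

tracemalloc.stop()

print(f"current={current} bytes, peak={peak} bytes")
for stat in top_stats[:3]:                        # the three largest allocation sites
    print(stat)
```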

Optimize Memory Usage: Minimize the creation of unnecessary objects, and consider using suitable built-in data structures and memory-efficient iterators like generators. Additionally, use appropriate data types to avoid unnecessary memory overhead.
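
To give a sense of the difference a generator can make, the sketch below compares the size of a list with the size of an equivalent generator object. The names are illustrative, and sys.getsizeof() reports only the size of the container object itself, not of the integers it refers to:

```python
import sys

squares_list = [n * n for n in range(100_000)]   # all values stored at once
squares_gen = (n * n for n in range(100_000))    # values produced on demand

list_size = sys.getsizeof(squares_list)   # grows with the number of elements
gen_size = sys.getsizeof(squares_gen)     # small, independent of the range length

print(list_size > gen_size)                    # True
print(sum(squares_gen) == sum(squares_list))   # True: same values either way
```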

Conclusion

This article explained how garbage collection and memory management work in Python. Let’s review what we’ve learned. 

We learned how Python uses reference counting to reclaim objects that are no longer in use. Then, we looked at how Python handles cyclic references and the generational approach to garbage collection.

We then went over the advantages of garbage collection and wrapped up by discussing some of the best practices for garbage collection in Python.

Next, check out this tutorial on built-in data structures in Python.

  • Bala Priya C
    Author
    Bala Priya is a developer and technical writer from India with over three years of experience in the technical content writing space. She shares her learning with the developer community by authoring tech tutorials, how-to guides, and more….
  • Rashmi Sharma
    Editor

    Rashmi is a highly experienced content manager, SEO specialist, and data analyst with over 7 years of expertise. She has a solid academic background in computer applications and a keen interest in data analysis.


    Rashmi is…
