  • Subprocesses let you interact with the operating system on a whole new level.

    Our computers run subprocesses all the time. In fact, just by reading this article, you’re running many processes, such as a network manager and the web browser itself.

    The cool thing about this is that almost any action we perform on our computer involves invoking a subprocess. That remains true even if we are writing a simple “hello world” script in Python.

    The concept of the subprocess may seem obscure even if you’ve been learning programming for a while. This article will take a deep look at the main concept of the subprocess, and how to use the Python subprocess standard library.

    By the end of this tutorial, you will:

    • Understand the concept of a subprocess
    • Have learned the basics of the Python subprocess library
    • Have practiced your Python skills with useful examples

    Let’s get into it.

    The concept of subprocess

    Broadly speaking, a subprocess is a computer process created by another process.
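    As a minimal sketch of that parent and child relationship (assuming Python 3.7+ on Linux or macOS; on Windows, launcher wrappers may add an intermediate process), the script below starts a second Python interpreter and asks it who its parent is. The variable names are only illustrative.

```python
import os
import subprocess
import sys

# The child is a second Python interpreter that prints its parent's PID.
child_code = "import os; print(os.getppid())"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
)

parent_pid = os.getpid()
child_reported_parent = int(result.stdout)

print(f"This script's PID: {parent_pid}")
print(f"Parent PID seen by the child: {child_reported_parent}")
```

    Both numbers should match, because the child process was created by this script.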

    We can think of the processes on a computer as a tree, in which each parent process has child processes running under it. This can be quite confusing at first, so let’s look at a simple graphic.

    There are several ways to visualize the processes running on our computer. For example, on Unix-like systems (Linux and macOS) we have htop, which is an interactive process viewer.

    Htop process viewer

    The tree mode is the most useful view for inspecting running subprocesses. We can activate it with F5.

    If we look closely at the command column, we can see the structure of the processes running on our computer.

    Htop process structure
    It all starts with /sbin/init, the first process started at boot and the ancestor of every other process on the system. From that point, we can see other processes branching off, like xfce4-screenshooter and xfce4-terminal (which in turn leads to even more subprocesses).

    On Windows, we have the well-known Task Manager, which is useful for killing crashed programs on our machine.

    Windows task manager

    Now we have a crystal clear concept. Let’s see how we can implement subprocesses in Python.

    Subprocesses in Python

    A subprocess in Python is a task that a Python script delegates to the operating system (OS).

    The subprocess library allows us to execute and manage subprocesses directly from Python. That involves working with standard input (stdin), standard output (stdout), and return codes.

    We don’t have to install it with pip, since it’s part of the Python standard library.

    Therefore we can start using subprocesses in Python just by importing the module.
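    To preview how those three pieces fit together, here is a hedged sketch (assuming Python 3.7+). It uses the run function, which is covered in detail later, to launch a second Python interpreter as a stand-in for any external command. The child program and its exit status of 3 are made up for the demonstration.

```python
import subprocess
import sys

# Child program: read stdin, print it upper-cased, then exit with status 3.
child_code = "import sys; print(sys.stdin.read().upper()); sys.exit(3)"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    input="hello from the parent",  # sent to the child's stdin
    capture_output=True,            # collect the child's stdout and stderr
    text=True,                      # work with str instead of bytes
)

print(result.stdout.strip())  # what the child wrote to stdout
print(result.returncode)      # the child's return code
```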

    import subprocess
    
    # Using the module ....

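    Note: To follow along with this article you should have Python 3.7 or later. The run function itself arrived in Python 3.5, but the capture_output and text arguments used below were added in 3.7.

    To check the Python version you currently have, just run: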

    ❯ python --version
    Python 3.9.5 # My result
    

    If the Python version you get is 2.x, you can use the following command:

    python3 --version

    Continuing with the topic, the main idea behind the subprocess library is to be able to interact with the OS by executing any commands we want, directly from the Python interpreter.

    That means we can do whatever we want, as long as our OS allows it (and as long as you don’t remove your root filesystem 😅).

    Let’s see how to use it by creating a simple script that lists the files of the current directory.

    First subprocess application

    First, let’s create a file named list_dir.py. This is the file where we’ll experiment with listing files.

    touch list_dir.py

    Now let’s open that file and use the following code.

    import subprocess 
    
    subprocess.run('ls')

    First, we import the subprocess module, and then we use the run function, which runs the command we pass as an argument.

    This function was introduced in Python 3.5 as a friendly shortcut to subprocess.Popen. The subprocess.run function runs a command and waits for it to finish, whereas Popen starts the process and returns immediately, leaving us the option to call communicate later.

    As for the output, ls is a UNIX command that lists the files of the directory you’re in. Therefore, if you run this script, you’ll get a list of the files present in the current directory.
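    The difference can be sketched as follows, assuming Python 3.7+ and using a tiny Python one-liner as a placeholder for a real command.

```python
import subprocess
import sys

# Placeholder command: a child interpreter that prints one word.
command = [sys.executable, "-c", "print('done')"]

# run() starts the command and blocks until it has finished.
completed = subprocess.run(command, capture_output=True, text=True)

# Popen() returns right away; the parent is free to do other work
# and collects the output later with communicate().
process = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
popen_output, _ = process.communicate()

print(completed.stdout == popen_output)
```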

    ❯ python list_dir.py
    example.py  LICENSE  list_dir.py  README.md

    Note: Take into account that if you’re on Windows, you’ll need to use different commands. For example, instead of “ls” you can use “dir”; because dir is built into the Windows command shell rather than being a separate program, it has to be run with shell=True.

    This may seem too simple, and you’re right. You’ll want to take advantage of all the power the shell offers, so let’s learn how to pass arguments to a command with subprocess.

    For example, to also list the hidden files (those that start with a dot) along with the metadata of every file, we write the following code.
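    One possible way to keep a script like this portable is to pick the command based on sys.platform. This is only a sketch; it assumes ls is available on non-Windows systems.

```python
import subprocess
import sys

if sys.platform.startswith("win"):
    # dir is a built-in of the Windows command shell, so it needs shell=True.
    result = subprocess.run("dir", shell=True)
else:
    result = subprocess.run(["ls"])

print(f"Exit status: {result.returncode}")
```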

    import subprocess
    
    # subprocess.run('ls')  # Simple command
    
    subprocess.run('ls -la', shell=True)

    We’re running this command as a string and using the argument shell. That means we’re invoking a shell at the start of the execution of our subprocess, and the command argument is interpreted directly by the shell.

    However, using shell=True has many downsides, and the worst is the security risk: if any part of the command string comes from untrusted input, it can be used to inject extra shell commands. You can read about this in the official documentation.

    The best way to pass commands to the run function is to use a list where lst[0] is the command to call (ls in this case) and the remaining elements are the arguments of that command.

    If we do so, our code will look like this.

    import subprocess
    
    # subprocess.run('ls')  # Simple command
    
    # subprocess.run('ls -la', shell=True) # Dangerous command
    
    subprocess.run(['ls', '-la'])
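    To make the risk of shell=True concrete, here is a harmless sketch for Linux or macOS (it assumes the echo command exists and Python 3.7+). The user_input value is an invented example of hostile input.

```python
import subprocess

# Invented input that tries to smuggle in a second command.
user_input = "notes.txt; echo INJECTED"

# With shell=True the shell parses the whole string, so the part after
# the semicolon runs as a separate command.
unsafe = subprocess.run(
    f"echo {user_input}", shell=True, capture_output=True, text=True
)

# With a list, the input is handed to echo as one plain argument.
safe = subprocess.run(["echo", user_input], capture_output=True, text=True)

print(unsafe.stdout)
print(safe.stdout)
```

    In the first call the injected echo runs on its own line; in the second, the whole input is printed back as ordinary text.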

    If we want to store the standard output of a subprocess in a variable, we can do it by setting the argument capture_output to True.

    list_of_files = subprocess.run(['ls', '-la'], capture_output=True)
    
    print(list_of_files.stdout)
    
    ❯ python list_dir.py 
    b'total 36\ndrwxr-xr-x 3 daniel daniel 4096 may 20 21:08 .\ndrwx------ 30 daniel daniel 4096 may 20 18:03 ..\n-rw-r--r-- 1 daniel daniel 55 may 20 20:18 example.py\ndrwxr-xr-x 8 daniel daniel 4096 may 20 17:31 .git\n-rw-r--r-- 1 daniel daniel 2160 may 17 22:23 .gitignore\n-rw-r--r-- 1 daniel daniel 271 may 20 19:53 internet_checker.py\n-rw-r--r-- 1 daniel daniel 1076 may 17 22:23 LICENSE\n-rw-r--r-- 1 daniel daniel 216 may 20 22:12 list_dir.py\n-rw-r--r-- 1 daniel daniel 22 may 17 22:23 README.md\n'

    To access the output of a process, we use the instance attribute stdout.

    In this case, we want to store the output as a string instead of bytes, and we can do so by setting the text argument to True.

    list_of_files = subprocess.run(['ls', '-la'], capture_output=True, text=True)
    
    print(list_of_files.stdout)
    
    ❯ python list_dir.py
    total 36
    drwxr-xr-x  3 daniel daniel 4096 may 20 21:08 .
    drwx------ 30 daniel daniel 4096 may 20 18:03 ..
    -rw-r--r--  1 daniel daniel   55 may 20 20:18 example.py
    drwxr-xr-x  8 daniel daniel 4096 may 20 17:31 .git
    -rw-r--r--  1 daniel daniel 2160 may 17 22:23 .gitignore
    -rw-r--r--  1 daniel daniel  271 may 20 19:53 internet_checker.py
    -rw-r--r--  1 daniel daniel 1076 may 17 22:23 LICENSE
    -rw-r--r--  1 daniel daniel  227 may 20 22:14 list_dir.py
    -rw-r--r--  1 daniel daniel   22 may 17 22:23 README.md
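    Once the output is a string, ordinary string methods apply. As a small sketch (assuming a Unix-like system with ls), the names can be split into a Python list and filtered:

```python
import subprocess

result = subprocess.run(["ls", "-a"], capture_output=True, text=True)

# One entry per line of output.
entries = result.stdout.splitlines()

# Hidden files start with a dot; skip the "." and ".." directory entries.
hidden = [name for name in entries
          if name.startswith(".") and name not in (".", "..")]

print(f"{len(entries)} entries, {len(hidden)} of them hidden")
```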

    Perfect, now that we know the basics of the subprocess library, it’s time to move on to some usage examples.

    Usage Examples of subprocess in Python

    In this section, we’re going to review some practical uses of the subprocess library. You can check all of them in this GitHub repository.

    Program checker

    One of the main uses of this library is performing simple OS operations.

    For instance, we can write a simple script that checks if a program is installed. In Linux, we can do this with the which command.

    '''Program checker with subprocess'''
    
    import subprocess
    
    program = 'git'
    
    process = subprocess.run(['which', program], capture_output=True, text=True)
    
    if process.returncode == 0: 
        print(f'The program "{program}" is installed')
    
        print(f'The location of the binary is: {process.stdout}')
    else:
        print(f'Sorry the {program} is not installed')
    
        print(process.stderr)

    Note: In UNIX, a command that succeeds exits with status code 0. Any other value means something went wrong during the execution.

    Since we’re not using the shell=True argument, we can take user input more safely. We can also check whether the input is a valid program name with a regex pattern.
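    Instead of comparing returncode by hand, run can raise an exception on failure when check=True is passed. A minimal sketch, using a child interpreter that deliberately exits with status 2:

```python
import subprocess
import sys

# Placeholder command that always fails with exit status 2.
failing_command = [sys.executable, "-c", "import sys; sys.exit(2)"]

try:
    subprocess.run(failing_command, check=True)
    failed_code = 0
except subprocess.CalledProcessError as error:
    # The exception carries the exit status of the failed command.
    failed_code = error.returncode

print(f"The command exited with status {failed_code}")
```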

    import subprocess
    
    import re
    
    programs = input('Separate the programs with a space: ').split()
    
    # Accept only names made of letters, digits, and underscores
    secure_pattern = r'\w+'
    
    for program in programs:
    
        # fullmatch requires the whole string to match, not just its start
        if not re.fullmatch(secure_pattern, program):
            print("Sorry we can't check that program")
    
            continue
    
        process = subprocess.run(
            ['which', program], capture_output=True, text=True)
    
        if process.returncode == 0:
            print(f'The program "{program}" is installed')
    
            print(f'The location of the binary is: {process.stdout}')
        else:
            print(f'Sorry the {program} is not installed')
    
            print(process.stderr)
    
        print('\n')
    

    In this case, we’re getting the programs from the user and using a regular expression that ensures each program string only includes letters, digits, and underscores. We check the existence of each program with a for loop.
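    The which command is not available on Windows. As a portable alternative, the standard library offers shutil.which, which searches the PATH without starting a subprocess. The sketch below combines it with whole-string validation; is_safe_name and locate are hypothetical helper names, not part of any library.

```python
import re
import shutil

# Letters, digits, and underscores only; fullmatch checks the whole string.
SAFE_NAME = re.compile(r"\w+")


def is_safe_name(program):
    """Return True if the name contains only word characters."""
    return SAFE_NAME.fullmatch(program) is not None


def locate(program):
    """Return the path of the program, or None if it is not on the PATH."""
    if not is_safe_name(program):
        raise ValueError(f"Refusing to look up {program!r}")
    return shutil.which(program)


for name in ["git", "git;ls"]:
    try:
        print(f"{name}: {locate(name)}")
    except ValueError as error:
        print(error)
```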

    Simple Grep in Python

    Your friend Tom has a list of patterns in a text file and another large file in which he wants to get the number of matches for each pattern. He would spend hours running the grep command for every pattern.

    Fortunately, you know how to solve this problem with Python, and you’ll help him accomplish this task in a few seconds.

    import subprocess
    
    patterns_file = 'patterns.txt'
    readfile = 'romeo-full.txt'
    
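    with open(patterns_file, 'r') as f:
        for pattern in f:
            pattern = pattern.strip()
    
            # Skip blank lines, which would otherwise match every line
            if not pattern:
                continue
    
            # -c counts matching lines; -e keeps a pattern that starts
            # with "-" from being read as an option
            process = subprocess.run(
                ['grep', '-c', '-e', pattern, readfile], capture_output=True, text=True)
    
            if int(process.stdout) == 0:
                print(
                    f'The pattern "{pattern}" did not match any line of {readfile}')
    
                continue
    
            print(f'The pattern "{pattern}" matched {process.stdout.strip()} lines')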
    

     

    In this file, we define two variables holding the filenames we want to work with. Then we open the file that contains all the patterns and iterate over them. Next, we call a subprocess that runs a grep command with the “-c” flag (which counts the lines that match) and use a conditional to decide what to print.

    If you run this file (remember you can download the text files from the GitHub repo), you’ll see the number of matching lines for each pattern.
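    One detail worth handling is grep’s exit status: 0 means at least one line matched, 1 means nothing matched, and anything greater signals an error such as a missing file. The sketch below wraps the call in a hypothetical helper, count_matching_lines, and tries it on a small temporary file whose contents are invented for the demonstration. It assumes grep is installed.

```python
import os
import subprocess
import tempfile


def count_matching_lines(pattern, path):
    """Return how many lines of the file match the pattern."""
    process = subprocess.run(
        ["grep", "-c", "-e", pattern, path],
        capture_output=True,
        text=True,
    )
    # Exit status 0 or 1 is a normal result; anything higher is an error.
    if process.returncode > 1:
        raise RuntimeError(process.stderr.strip() or "grep failed")
    return int(process.stdout)


# Write a small sample file to search.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as sample:
    sample.write("Romeo\nJuliet\nRomeo and Juliet\n")

try:
    romeo_lines = count_matching_lines("Romeo", sample.name)
    tybalt_lines = count_matching_lines("Tybalt", sample.name)
finally:
    os.remove(sample.name)

print(romeo_lines, tybalt_lines)
```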

    Set up a virtualenv with subprocess

    One of the coolest things you can do with Python is process automation. This kind of script can save you hours of time per week.

    For example, we’re going to create a setup script that creates a virtual environment for us and tries to find a requirements.txt file in the current directory to install all the dependencies.

    import subprocess
    
    from pathlib import Path
    
    
    VENV_NAME = '.venv'
    REQUIREMENTS = 'requirements.txt'
    
    process1 = subprocess.run(['which', 'python3'], capture_output=True, text=True)
    
    if process1.returncode != 0:
        raise OSError('Sorry python3 is not installed')
    
    python_bin = process1.stdout.strip()
    
    print(f'Python found in: {python_bin}')
    
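    # Detect the user's shell (informational only; not used further below)
    process2 = subprocess.run('echo "$SHELL"', shell=True, capture_output=True, text=True)
    
    shell_bin = process2.stdout.strip().split('/')[-1]
    
    # check=True raises CalledProcessError if the venv cannot be created
    create_venv = subprocess.run([python_bin, '-m', 'venv', VENV_NAME], check=True)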
    
    if create_venv.returncode == 0:
        print(f'Your venv {VENV_NAME} has been created')
    
    pip_bin = f'{VENV_NAME}/bin/pip3'
    
    if Path(REQUIREMENTS).exists():
        print(f'Requirements file "{REQUIREMENTS}" found')
        print('Installing requirements')
        # check=True stops the script if the installation fails
        subprocess.run([pip_bin, 'install', '-r', REQUIREMENTS], check=True)
    
        print('Process completed! Now activate your environment with "source .venv/bin/activate"')
    
    else:
        print("No requirements specified ...")

     

    In this case, we’re using multiple processes and parsing the data we need in our Python script. We’re also using the pathlib library, which allows us to check whether the requirements.txt file exists.

    If you run the Python file, you’ll get some useful messages about what’s happening with the OS.

    ❯ python setup.py 
    Python found in: /usr/bin/python3
    Your venv .venv has been created
    Requirements file "requirements.txt" found
    Installing requirements
    Collecting asgiref==3.3.4 .......
    Process completed! Now activate your environment with "source .venv/bin/activate"

    Note that we get the output from the installation process because we’re not redirecting the standard output to a variable.
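    The three common choices for a child’s standard output can be sketched like this (assuming Python 3.7+; the one-line child program is just a placeholder for a real command such as pip):

```python
import subprocess
import sys

command = [sys.executable, "-c", "print('visible')"]

# Default: the child writes straight to our terminal; nothing is stored.
inherited = subprocess.run(command)

# capture_output=True: the text is stored in the result instead.
captured = subprocess.run(command, capture_output=True, text=True)

# DEVNULL: the output is discarded entirely.
silenced = subprocess.run(command, stdout=subprocess.DEVNULL)

print(inherited.stdout)         # None, since nothing was captured
print(captured.stdout.strip())  # the captured text
```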

    Run another Programming Language

    We can run programs written in other languages from Python and get their output. This is possible because subprocesses interact directly with the operating system.

    For instance, let’s create a hello world program in C++ and Java. In order to execute the following files, you’ll need to install C++ and Java compilers.

    helloworld.cpp

    #include <iostream>
    
    int main(){
        std::cout << "This is a hello world in C++" << std::endl;
        return 0;
    }


    helloworld.java

    class HelloWorld{  
        public static void main(String args[]){  
         System.out.println("This is a hello world in Java");  
        }  
    }  


    I know this seems a lot of code compared to a simple Python one-liner, but this is just for testing purposes.

    We’re going to create a Python script that runs all the C++ and Java files in a directory. To do this, we first want to get a list of files by file extension, and glob allows us to do it easily!

    from glob import glob
    
    # Gets files with each extension
    java_files = glob('*.java')
    
    cpp_files = glob('*.cpp')

    After that, we can start using subprocesses to execute each type of file.

    import subprocess
    
    for file in cpp_files:
        # Compile first; check=True stops the script if compilation fails
        subprocess.run(['g++', file, '-o', 'out'], check=True)
        process = subprocess.run(['./out'], capture_output=True, text=True)
    
        output = process.stdout.strip() + ' BTW this was run by Python'
    
        print(output)
    
    for file in java_files:
        # Java 11 and later can compile and run a single source file directly
        process = subprocess.run(['java', file], capture_output=True, text=True)
    
        output = process.stdout.strip() + ' A Python subprocess ran this :)'
        print(output)

    One little trick is to use the string function strip to modify the output and only get what we need.

    Note: Be careful when running large Java or C++ programs this way, since we’re loading their entire output into memory, which could consume a lot of RAM.
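    When the output may be large, one option is to read it line by line with Popen instead of capturing it all at once. A minimal sketch, with a short Python child standing in for the compiled program:

```python
import subprocess
import sys

# Placeholder child that prints a few lines.
child_code = "for i in range(5): print('line', i)"

line_count = 0

with subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    text=True,
) as process:
    # Each line is handled as it arrives, so the full output is never
    # held in memory at once.
    for line in process.stdout:
        line_count += 1

print(f"Read {line_count} lines, exit status {process.returncode}")
```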

    Open external programs

    We’re able to run other programs just by calling their binaries through a subprocess.

    Let’s try it out by opening Brave, my preferred web browser.

    import subprocess
    
    subprocess.run('brave')

    This will open a browser instance, or just another tab if the browser is already running.

    Opened browser

    As with any other program that accepts flags, we can use them to produce the desired behavior.

    import subprocess
    
    subprocess.run(['brave', '--incognito'])

    Incognito flag
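    Keep in mind that run waits until the launched program exits, so a script may pause for as long as the program stays open. If the script should continue right away, Popen is one option. The sketch below uses a sleeping Python child as a stand-in for a long-running program such as a browser.

```python
import subprocess
import sys

# Stand-in for a long-running program.
long_running = [sys.executable, "-c", "import time; time.sleep(1)"]

# Popen returns immediately instead of waiting for the program to exit.
process = subprocess.Popen(long_running)

# poll() returns None while the child is still running.
still_running = process.poll() is None
print(f"Child still running: {still_running}")

# Wait for the child when we actually need it to be finished.
process.wait()
print(f"Child exited with status {process.returncode}")
```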

    To sum up

    A subprocess is a computer process created by another process. We can check the processes our computer is running with tools like htop and the Task Manager.

    Python has its own library to work with subprocesses. Currently, the run function gives us a simple interface to create and manage subprocesses.

    We can build many kinds of applications with them because we’re interacting directly with the OS.

    Finally, remember that the best way to learn is to create something you would like to use.