In this tutorial, you’ll learn how to use Python’s split() method to split a string into a list of strings.
When working with Python strings, you can use several built-in string methods to obtain modified copies of strings, such as converting to uppercase, replacing a substring, and more. One such method is .split(), which splits a Python string into a list of strings, and we’ll learn more about it by coding examples.
By the end of the tutorial, you will have learned the following:
- how the .split() method works
- how to customize the split using the sep and maxsplit parameters
Syntax of the split() Method in Python
Here’s the general syntax to use Python’s split() method on any valid string:
string.split(sep, maxsplit)
# Parameters: sep, maxsplit
# Returns: A list of strings
Here, string can be any valid Python string. Both the sep and maxsplit parameters are optional.
- sep denotes the separator on which you’d like to split the string. It should be specified as a string.
- maxsplit is an integer that specifies the maximum number of times you want to split the string.
Their default values are used when you don’t provide optional parameters.
- When you don’t provide the sep value explicitly, whitespace is used as the default separator.
- When you don’t specify the value for maxsplit, it defaults to -1, which means that the string will be split on all occurrences of the separator.
Phrasing the syntax in plain language:
The split() method splits a string maxsplit number of times on the occurrence of the separator specified by the parameter sep.
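As a quick check of the defaults described above, the following sketch compares a call with no arguments against a call that spells the default values out. The sample string and variable names are made up for illustration.

```python
# Hypothetical sample string, used only for illustration
text = "red green blue"

no_args = text.split()                        # rely on both defaults
explicit = text.split(sep=None, maxsplit=-1)  # pass the defaults explicitly

print(no_args)              # ['red', 'green', 'blue']
print(no_args == explicit)  # True
```

Passing sep=None is how Python represents "split on whitespace", so both calls return the same list.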
Now that we’ve learned the syntax of the Python split() method, let’s proceed to code some examples.
Split a Python String into a List of Strings
If you have Python 3 installed on your machine, you can code with this tutorial by running the following code snippets in a Python REPL.
To start the REPL, run one of the following commands from the terminal:
$ python
$ python -i
▶️ You can also try out these examples on Geekflare’s Python editor.
In this example, py_str is a Python string. Let’s call the .split() method on py_str without any parameters and observe the output.
py_str = "Learn how to use split() in Python"
py_str.split()
# Output
['Learn', 'how', 'to', 'use', 'split()', 'in', 'Python']
As seen above, the string is split on all occurrences of whitespace.
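One detail about the default behavior is worth a short sketch: when no separator is given, any run of consecutive whitespace (spaces, tabs, newlines) counts as a single separator. Passing a single space explicitly behaves differently, because every individual space then becomes a split point. The string below is made up for illustration.

```python
messy = "Learn   how \t to\nsplit"

default_split = messy.split()    # any run of whitespace is one separator
space_split = messy.split(" ")   # every single space is a separator

print(default_split)  # ['Learn', 'how', 'to', 'split']
print(space_split)    # contains empty strings and leftover tab/newline characters
```

If the input may contain irregular spacing, calling split() with no arguments is usually the more forgiving choice.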
Split a Python String on the Occurrence of Separators
#1. As a first example, let’s split the string py_str with double underscores (__) as the separator.
py_str = "All__the__best"
py_str.split(sep='__')
# Output
['All', 'the', 'best']
#2. Let’s take another example. Here, py_str has three sentences, each terminated by a period (.).
py_str = "I love coding. Python is cool. I'm learning Python in 2022"
py_str.split(sep='.')
# Output
['I love coding', ' Python is cool', " I'm learning Python in 2022"]
▶️ When we call the .split() method on this string, with ‘.’ as the separator, the resultant list has three sentences, as seen in the above code cell.
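Note that the second and third items in that output keep the space that followed each period. If that is unwanted, one common approach (a sketch, not the only option) is to strip each piece in a list comprehension. The variable name sentences is chosen here for illustration.

```python
py_str = "I love coding. Python is cool. I'm learning Python in 2022"

# Split on periods, then trim surrounding whitespace from each piece
sentences = [part.strip() for part in py_str.split(sep='.')]

print(sentences)
# ['I love coding', 'Python is cool', "I'm learning Python in 2022"]
```

Another option for this particular string would be to split on '. ' (period followed by a space), since the separator can be longer than one character.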
#3. Let’s ask a few questions:
- What happens when the separator never occurs in the string?
- How will the split occur in this case?
Here’s an example:
We try to split py_str on the occurrence of an asterisk (*), which does not occur in the string.
py_str = "This line contains no asterisk."
py_str.split(sep='*')
# Output
['This line contains no asterisk.']
As no split can be done in this case, the resultant list contains the entire string.
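The sketch below confirms this behavior by checking the length of the result, and also shows a related edge case: the separator itself must not be an empty string, or Python raises a ValueError. The variable name parts is chosen for illustration.

```python
py_str = "This line contains no asterisk."

parts = py_str.split(sep='*')
print(len(parts))          # 1
print(parts[0] == py_str)  # True

# An empty separator is not allowed and raises ValueError
try:
    py_str.split(sep='')
except ValueError as err:
    print("ValueError:", err)
```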
In the next section, we’ll see how we can use the split() method on the contents of a text file.
Split the Contents of a Python File
When working with text files in Python, you may have to split the file’s contents—based on a separator—for easier processing.
Here’s a sample text file:
The code snippet below shows how to use split on the contents of the sample text file.
with open('sample.txt') as f:
    content = f.read()
    str_list = content.split(sep='...')
    for string in str_list:
        print(string, end='')
The above code does the following:
- Uses the with context manager to open and work with the text file ‘sample.txt’.
- Reads the contents of the file using the .read() method on the file object.
- Splits the content on the occurrence of the separator, an ellipsis written as three periods (...), into a list.
- Loops through str_list to access each string and prints it out.
Here’s the output.
# Output
This is a sample text file
It contains info on
Getting started with programming in Python
According to the 2022 StackOverflow Developer Survey
Python is one of the most-loved programming languages
So what are you waiting for? Start learning!
As an exercise, you can try splitting the contents of a text file on any separator of choice.
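As one possible take on that exercise, here is a self-contained sketch that writes a small temporary file and then splits its contents on a pipe character. The file contents, the '|' separator, and the variable names are all assumptions made for illustration; they are not part of the sample file used above.

```python
import os
import tempfile

# Hypothetical file contents; the '|' separator is chosen only for this sketch
sample_text = "first record|second record|third record"

# Write the sample to a temporary file so the example runs on its own
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(sample_text)
    path = tmp.name

# Read the file back and split its contents on the separator
with open(path) as f:
    records = f.read().split(sep='|')

os.remove(path)  # clean up the temporary file

for record in records:
    print(record)
```

Each record is printed on its own line because print() adds a newline by default, unlike the end='' call used earlier.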
Split a Python String into Chunks
When you split a string once, you’ll get two chunks; splitting it twice will get three.
📋 In general, when you split a string K times, you’ll get K + 1 chunks.
This is illustrated below.
#1. We set maxsplit equal to 1. We haven’t specified a separator, so the split will occur on whitespaces by default.
py_str = "Chunk#1 I'm a larger chunk, Chunk#2"
py_str.split(maxsplit=1)
# Output
['Chunk#1', "I'm a larger chunk, Chunk#2"]
Even though the second chunk in the list contains whitespaces, no further split occurs because the number of splits is capped by the maxsplit value of one.
#2. Let’s increase the maxsplit value to 2 and observe how the split occurs for the following example.
py_str = "Chunk#1 Chunk#2 I'm one large Chunk#3, even though I contain whitespaces"
py_str.split(maxsplit=2)
# Output
['Chunk#1', 'Chunk#2', "I'm one large Chunk#3, even though I contain whitespaces"]
As with the previous example, the maxsplit value decides the number of splits made. We get three chunks, with splits at the first and second occurrences of whitespace.
#3. What happens if you set maxsplit to a value greater than the number of occurrences of the separator?
In the following code cell, we set sep to ‘,’ and maxsplit to 8 when the string contains only four commas.
py_str = "There, are, only, 4, commas"
py_str.split(sep=',', maxsplit=8)
# Output
['There', ' are', ' only', ' 4', ' commas']
Here, the split method splits py_str on all four occurrences of a comma. Even if you try setting maxsplit to a value less than -1, say, -7, the split will be done on all occurrences of the separator.
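To see the K + 1 rule and the edge cases in one place, here is a small sketch that splits the same string on commas with several maxsplit values and records how many chunks come back. The dictionary name chunk_counts is made up for illustration.

```python
py_str = "There, are, only, 4, commas"

chunk_counts = {}
for k in (0, 1, 2, 8, -1, -7):
    chunks = py_str.split(sep=',', maxsplit=k)
    chunk_counts[k] = len(chunks)
    print(k, len(chunks))

# Prints:
# 0 1
# 1 2
# 2 3
# 8 5
# -1 5
# -7 5
```

For values of K from 0 up to the number of separators, the list has K + 1 items. Larger or negative values simply split on every occurrence, giving five chunks here.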
Next, let’s put together all that we have learned and use both the sep and maxsplit parameters.
Split a Python String into Chunks on a Separator
#1. Suppose we need to split the string py_str into three chunks on the occurrence of comma (,). To do this, we can set the sep value to ‘,’ and maxsplit value to 2 in the method call.
py_str = "Chunk#1, Chunk#2, I'm one large Chunk#3, even though I contain a ,"
py_str.split(sep=',', maxsplit=2)
# Output
['Chunk#1', ' Chunk#2', " I'm one large Chunk#3, even though I contain a ,"]
As seen in the output, the split occurs twice on the first two occurrences of the separator.
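A practical use of combining sep and maxsplit is parsing a line where only the first separator is meaningful. The sketch below uses a hypothetical key=value line whose value itself contains an equals sign; the name config_line and its contents are assumptions for illustration.

```python
# Hypothetical "key=value" line; the value contains '=' as well
config_line = "url=https://example.com/search?q=python"

# Split only once, so everything after the first '=' stays together
key, value = config_line.split(sep='=', maxsplit=1)

print(key)    # url
print(value)  # https://example.com/search?q=python
```

Without maxsplit=1, the line would be cut into three pieces and the two-name unpacking would fail.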
#2. The separator sep does not always have to be a special character. It can be a sequence of special characters, like the double underscores we used earlier, or it could even be a substring.
Let us set the string ‘learn’ as the sep argument and see how the split occurs for varying values of maxsplit. Here, we set maxsplit to 2.
py_str = "You need to learn data structures, learn algorithms, and learn more!"
py_str.split(sep='learn', maxsplit=2)
# Output
['You need to ', ' data structures, ', ' algorithms, and learn more!']
#3. If you’d like to split py_str on all occurrences of the string ‘learn’, you can call the .split() method by setting sep = 'learn' without the maxsplit parameter. This is equivalent to explicitly setting the maxsplit value to -1, as shown in the code cell below.
py_str = "You need to learn data structures, learn algorithms, and learn more!"
py_str.split(sep='learn', maxsplit=-1)
# Output
['You need to ', ' data structures, ', ' algorithms, and ', ' more!']
We see that the split occurs on all occurrences of ‘learn’.
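Two simple checks can confirm that a split covered every occurrence: the number of pieces is one more than the number of times the separator appears, and joining the pieces with the separator restores the original string. This is a sketch using the built-in count() and join() string methods; the name pieces is chosen for illustration.

```python
py_str = "You need to learn data structures, learn algorithms, and learn more!"

pieces = py_str.split(sep='learn')

# One more piece than there are occurrences of the separator
print(len(pieces) - 1 == py_str.count('learn'))  # True

# Joining the pieces with the separator rebuilds the original string
print('learn'.join(pieces) == py_str)            # True
```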
I hope you’ve now understood how to use the .split() method with Python strings.
Here’s a summary of this tutorial:
- Python’s built-in .split() method splits a string into a list of strings.
- Use string.split() to split the string on all occurrences of the default separator, whitespace.
- Use string.split(sep,maxsplit) to split the string maxsplit number of times on the occurrence of separator sep. The resultant list has at most maxsplit+1 items.
As a next step, you can learn how to check if Python strings are palindromes or anagrams.