Regular expressions can be a pain. Well, sometimes!

Let’s learn about regular expressions and their patterns. We are going to look at patterns that seem like a convoluted soup of characters, and see what every character in a regular expression means.

After reading this article, you will be able to create your own regular expressions and use them as you like. At the end, we will also list some online RegEx testing tools, so that you can build a RegEx for your requirement and test it with them.

### Introduction

A regular expression, commonly known as a RegEx, is a sequence of characters that can be used as a pattern to search for characters or strings.

For example, to determine if a string or phrase contains the word “apple”, we can use the regex “/apple” to search within the string. As another example, we can use “`/[0-9]`” to check if a given string contains a digit from 0 to 9.
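As a minimal sketch of these two searches in Python (where a pattern is a plain string, so no leading slash is needed), the helper names `contains_apple` and `contains_digit` below are made up for illustration:

```python
import re

def contains_apple(text: str) -> bool:
    """Return True if the literal text "apple" appears anywhere in text."""
    return re.search(r"apple", text) is not None

def contains_digit(text: str) -> bool:
    """Return True if text contains at least one digit from 0 to 9."""
    return re.search(r"[0-9]", text) is not None

print(contains_apple("I ate an apple today"))  # True
print(contains_digit("no digits here"))        # False
```

Note that the search is case-sensitive by default, so "Apple" would not match the pattern `apple`.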

### Regular Expressions and their use

Regular expressions are used for a wide variety of purposes on the modern web. Validation of web forms, web search engines, lexical analyzers in IDEs, text editors, and document editors are a few examples where regular expressions are frequently used.

We have all used “`CTRL + F`” many times to search within a document or a piece of code for a particular word, phrase, or expression. This is a very common example of the use of regular expressions.

Before going on any further, let’s have a look at a very commonly used regular expression.

Can you guess 🤔 what the RegEx below is used for?

``^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$``

Don’t worry if you can’t guess it. I am sure you will be able to by the end of this article.

First, let’s get started with the ABCs of RegEx.

### Tokens

To start with, let’s look at the various symbols in the Regex shown above.

``^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$``

If we look at the regex given above, we can see that it is composed of many symbols, characters, or tokens. Let’s find out what they mean:

### Break down of the given Regex pattern

Now, armed with this preliminary knowledge of tokens, let’s try to decode the above regular expression:

• `^([a-zA-Z0-9_\-\.]+)` means we are looking for a string that starts with one or more letters (uppercase or lowercase), digits, underscores, hyphens, or dots. For instance, anything that looks like user_name.01 will match the pattern. Note that the string does not need to include every kind of symbol; each character just has to be one of those in `[a-zA-Z0-9_\-\.]`.
• The @ character matches a single occurrence of @. Adding to the previous example, something like [email protected] will fit.
• `([a-zA-Z0-9_\-\.]+)` is similar to the first point. It too means that we are looking for one or more letters, digits, underscores, hyphens, or dots. Adding to the example, [email protected] will fit here.
• As you might have already guessed, we are hinting at an email pattern. Moving on, `\.` matches a single “.” character. If we continue with the ongoing example, something like [email protected] will fit.
• `([a-zA-Z]{2,5})$` means that the string should end with 2 to 5 letters, either uppercase or lowercase. If we add .com to the previous example, we get [email protected], which is the common pattern of an email string.

Combining all of the above, we can see that we are searching for an email id string. Now we can use this expression to validate any email id. If our test email id matches this pattern we can say it is a valid email id.

P.S. This is a pattern for the most common email ids on the web.
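The breakdown above can be tried out with a short Python sketch. It assumes the pattern ends with the `$` end-of-string anchor, and the names `EMAIL_PATTERN` and `is_valid_email` as well as the sample addresses are made up for illustration:

```python
import re

# The email pattern discussed above: local part, "@", domain, ".", 2-5 letter suffix.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)

def is_valid_email(candidate: str) -> bool:
    """Return True if candidate matches the common email pattern."""
    return EMAIL_PATTERN.match(candidate) is not None

print(is_valid_email("user_name.01@example.com"))  # True
print(is_valid_email("user_name.01@example"))      # False: no ".suffix" part

# The three parenthesized groups capture the pieces of the address.
match = EMAIL_PATTERN.match("user_name.01@example.com")
print(match.groups())  # ('user_name.01', 'example', 'com')
```

This only checks the shape of the string; it does not tell us whether the mailbox actually exists.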

### Types of Tokens

Many tokens can be used in various combinations within a Regex to describe a wide variety of expressions. Below we are going to take a look at the various types of tokens that are used in regular expressions. Furthermore, we are also going to look at the most commonly used tokens in each category.

#### Basic Tokens

Let’s start with the basic tokens. These tokens are used with almost every regular expression. Hence, we must learn about them first.

#### Character classes

Moving on, let’s look at the character tokens. They are used to match letters, digits, and other special characters.
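As a small illustration, here is a Python sketch of a few widely supported character classes. The sample string and variable names are made up for this example:

```python
import re

sample = "Flat 7B, Room_12"

digits = re.findall(r"\d", sample)       # \d matches any single digit
words = re.findall(r"\w+", sample)       # \w matches letters, digits, and underscore
capitals = re.findall(r"[A-Z]", sample)  # a custom set: uppercase letters only
others = re.findall(r"[^\w\s]", sample)  # ^ inside [] negates the set

print(digits)    # ['7', '1', '2']
print(words)     # ['Flat', '7B', 'Room_12']
print(capitals)  # ['F', 'B', 'R']
print(others)    # [',']
```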

#### Quantifiers

This special class of tokens is used to match the number of consecutive occurrences of a character or a string or a number. They are used in conjunction with the other tokens.

Let’s look at a few common quantifiers.
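The following Python sketch tries out the most common quantifiers (`*`, `+`, `?`, `{n}`, and `{m,n}`). The helper `fully_matches` is a hypothetical name introduced only for this example:

```python
import re

def fully_matches(pattern: str, text: str) -> bool:
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(fully_matches(r"ab*", "a"))          # True:  * means zero or more
print(fully_matches(r"ab+", "a"))          # False: + means one or more
print(fully_matches(r"colou?r", "color"))  # True:  ? means zero or one
print(fully_matches(r"\d{4}", "2024"))     # True:  {4} means exactly four
print(fully_matches(r"\d{2,3}", "1234"))   # False: {2,3} means two to three
```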

#### Groups

These tokens will match in groups as the name suggests.
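A brief Python sketch of how groups behave is shown below. It covers capturing groups, named groups, and non-capturing groups; the date string and the group names are chosen purely for illustration:

```python
import re

# Parentheses create capturing groups that can be read back after a match.
match = re.match(r"(\d{4})-(\d{2})-(\d{2})", "2021-03-15")
print(match.group(1))  # 2021
print(match.groups())  # ('2021', '03', '15')

# Python lets us name a group with (?P<name>...).
named = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2021-03-15")
print(named.group("year"))   # 2021
print(named.group("month"))  # 03

# (?:...) groups tokens for a quantifier without capturing anything.
repeated = re.fullmatch(r"(?:ab)+", "ababab")
print(repeated is not None)  # True
print(repeated.groups())     # ()
```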

#### Flags

These are special instructions given to the pattern matcher engine while searching for a match.
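In Python, flags are passed as constants from the `re` module. The sketch below shows three common ones; the sample strings are made up, and Python has no separate "global" flag because `re.findall` already returns every match:

```python
import re

# IGNORECASE: letters match regardless of case.
print(re.findall(r"apple", "Apple apple APPLE", flags=re.IGNORECASE))
# ['Apple', 'apple', 'APPLE']

# MULTILINE: ^ and $ also match at the start and end of each line.
lines = "one two\nthree four"
print(re.findall(r"^\w+", lines))                      # ['one']
print(re.findall(r"^\w+", lines, flags=re.MULTILINE))  # ['one', 'three']

# DOTALL: the dot also matches a line break.
print(re.search(r"a.b", "a\nb") is not None)                   # False
print(re.search(r"a.b", "a\nb", flags=re.DOTALL) is not None)  # True
```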

#### Anchors

Additional instructions for the engine regarding positions.
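Anchors match a position rather than a character. A minimal Python sketch with `^` (start), `$` (end), and `\b` (word boundary) follows; the sample words are made up for illustration:

```python
import re

# ^ pins the match to the start of the string.
starts_with_cat = re.search(r"^cat", "catalog") is not None
not_at_start = re.search(r"^cat", "bobcat") is not None

# $ pins the match to the end of the string.
ends_with_cat = re.search(r"cat$", "bobcat") is not None

# \b matches the boundary between a word character and a non-word character.
whole_words = re.findall(r"\bcat\b", "cat catalog bobcat cat.")

print(starts_with_cat)  # True
print(not_at_start)     # False
print(ends_with_cat)    # True
print(whole_words)      # ['cat', 'cat']
```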

### Commonly used regular expressions

Regular expressions are widely used over the Internet. From form validations to looking up data containing a particular keyword or keywords, regular expressions are almost inseparable from modern-day computing applications.

Let’s look at some familiar examples of the use of regular expressions.

#### Matching a phone number

Let’s look at the pattern of a phone number used in India. The country code comes first. It usually contains a “+” character followed by the number 91, which is the country code for India. Indian mobile numbers generally start with 6, 7, 8, or 9, followed by 9 other digits.

So a valid regex for an Indian cell phone number is given below.

``^(\+91[\-\s]?)?[0]?(91)?[6-9]\d{9}$``
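To see which inputs this pattern accepts, here is a short Python sketch. It assumes the pattern ends with the `$` anchor; `PHONE_PATTERN`, `is_indian_mobile`, and the sample numbers are hypothetical names and values used only for illustration:

```python
import re

# Optional "+91" prefix (with optional hyphen or space), optional leading 0,
# optional bare "91", then a digit from 6-9 followed by nine more digits.
PHONE_PATTERN = re.compile(r"^(\+91[\-\s]?)?[0]?(91)?[6-9]\d{9}$")

def is_indian_mobile(number: str) -> bool:
    """Return True if number looks like an Indian mobile number."""
    return PHONE_PATTERN.match(number) is not None

print(is_indian_mobile("+91 9876543210"))  # True
print(is_indian_mobile("09876543210"))     # True
print(is_indian_mobile("5876543210"))      # False: starts with 5
print(is_indian_mobile("98765"))           # False: too short
```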

#### Testing the strength of passwords

Most websites recommend a strong password that contains a combination of numbers, uppercase and lowercase characters, and symbols. There is also usually a minimum length, such as 6 or 8 characters. This is done so that the password becomes very hard to crack.

Any password following this rule can be generated or validated for password strength using a regular expression.

``^(((?=.*[a-z])(?=.*[A-Z]))|((?=.*[a-z])(?=.*[0-9]))|((?=.*[A-Z])(?=.*[0-9])))(?=.{6,})``
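Read literally, this pattern uses lookaheads to require at least two of the three classes (lowercase, uppercase, digit) and a length of six or more; it does not check for symbols. A minimal Python sketch follows, with `PASSWORD_PATTERN` and `is_acceptable_password` as made-up names:

```python
import re

# Each (?=...) is a lookahead: it checks a condition without consuming text.
PASSWORD_PATTERN = re.compile(
    r"^(((?=.*[a-z])(?=.*[A-Z]))|((?=.*[a-z])(?=.*[0-9]))|((?=.*[A-Z])(?=.*[0-9])))(?=.{6,})"
)

def is_acceptable_password(password: str) -> bool:
    """Return True if password has two of lower/upper/digit and 6+ characters."""
    return PASSWORD_PATTERN.match(password) is not None

print(is_acceptable_password("abcDEF"))  # True:  lowercase + uppercase
print(is_acceptable_password("abc123"))  # True:  lowercase + digits
print(is_acceptable_password("abcdef"))  # False: lowercase only
print(is_acceptable_password("aB1"))     # False: shorter than 6 characters
```

A stricter policy could add further lookaheads, for example one per required character class.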

#### URL Matching

URLs are the most common way to reach a webpage on the internet. Every website has a URL, and URLs follow a standardized pattern: the HTTP or HTTPS protocol, followed by “://” and often “www”, then the name of the website followed by .com, .net, .org, and so on.

To test the validity of a URL we can use a regex like the one given below.

``https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)``
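Here is a hedged Python sketch of URL validation with this kind of pattern. It assumes the first character class reads `[-a-zA-Z0-9@:%._\+~#=]`; `URL_PATTERN`, `is_valid_url`, and the sample URLs are names and values invented for this example:

```python
import re

# Assumption: the host part uses the class [-a-zA-Z0-9@:%._\+~#=].
URL_PATTERN = re.compile(
    r"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b"
    r"([-a-zA-Z0-9()@:%_\+.~#?&//=]*)"
)

def is_valid_url(candidate: str) -> bool:
    """Return True if the whole candidate string matches the URL pattern."""
    return URL_PATTERN.fullmatch(candidate) is not None

print(is_valid_url("https://www.example.com"))      # True
print(is_valid_url("http://example.org/path?q=1"))  # True
print(is_valid_url("ftp://example.com"))            # False: not http or https
print(is_valid_url("https://example"))              # False: no ".suffix" part
```

Because the pattern has no `^` or `$` anchors, `fullmatch` is used here so that the entire string has to be a URL.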

#### Date and Time formats

Date and time formats are also very commonly used across the web. Many date formats are used by different applications, software, and systems. Dates should always be written in a format that is usable for the user or the application that is trying to read them.

A date in the format MM/dd/yyyy can be validated by using a regular expression like the one given below.

``^(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}$``
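The pattern above puts the month first and separates the parts with slashes. The Python sketch below applies it and, as an extra step that is not part of the article's regex, uses `datetime.strptime` to catch dates that have the right shape but do not exist. The names `DATE_PATTERN`, `has_date_format`, and `is_real_date` are assumptions for illustration:

```python
import re
from datetime import datetime

# Month 01-12, day 01-31, four-digit year, separated by slashes.
DATE_PATTERN = re.compile(r"^(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}$")

def has_date_format(text: str) -> bool:
    """Return True if text has the MM/dd/yyyy shape."""
    return DATE_PATTERN.match(text) is not None

def is_real_date(text: str) -> bool:
    """Return True if text has the right shape and is a real calendar date."""
    if not has_date_format(text):
        return False
    try:
        datetime.strptime(text, "%m/%d/%Y")
    except ValueError:
        return False
    return True

print(has_date_format("12/31/2020"))  # True
print(has_date_format("31/12/2020"))  # False: 31 is not a valid month
print(has_date_format("02/30/2021"))  # True: the regex checks shape only
print(is_real_date("02/30/2021"))     # False: February has no 30th day
```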

Now, let’s explore some online RegEx tools that can be handy for building and troubleshooting expressions.

If you want to learn more about regular expressions, their examples, and advanced usage, here is a list of websites that you can always refer to:

## Regex101

Regex101 is an excellent reference guide and an interactive tool for creating your regular expressions. It can help you get started with regex very quickly.

Using this, we can test RegEx for the languages below.

• PCRE (PHP)
• ECMAScript (JavaScript)
• Python
• GoLang

It supports RegEx functionality such as match, substitution, and unit tests. Apart from this, one can save previously tested RegEx.

## FreeFormatter

FreeFormatter is JavaScript-based and uses the XRegExp library for enhanced features. It facilitates testing a RegEx against a match as well as replacing a match. It supports the flags below, which can be used as required while testing a RegEx:

• i – Case-insensitive
• m – Multiline
• g – Global (don’t stop at the first match)
• s – Dot matches all INCLUDING line breaks (XRegExp only).

## Regex Crossword

If Regex and puzzles interest you, this is the site to go to. It has a series of fun and interactive puzzles. They will definitely help you learn more about regular expressions.

• Optimized for phones and solving RegEx puzzles on the go.
• A step by step tutorial, teaching you the different symbols and RegEx patterns.
• Bend your mind around cubistic 2D palindrome RegEx puzzles.
• Wide range of RegEx puzzles with difficulties from beginner to expert.

## RegExr

RegExr is a website for getting your hands dirty with Regex. You can write regex, match patterns, and have all the fun with this Codepen equivalent for Regular Expressions.

Features

• Supports JavaScript & PHP/PCRE RegEx.
• Results update in real-time as you type.
• Roll over a match or expression for details.
• Validate patterns with suites of Tests.
• Save & share expressions with others.
• Full RegEx Reference with help & examples.

## Pythex

Pythex is a Python-based regular expression tester and a quick way to test your Python regular expressions. It comes with four flags, namely:

• Ignore Case
• Multiline
• DotAll
• Verbose

## Rubular

Rubular is a Ruby-based regular expression editor. It supports Ruby version 2.5.7 onwards.

## Debuggex

Debuggex is JavaScript-based and supports RegEx for Python and Perl Compatible Regular Expressions (PCRE). Using this online tool, we can embed our RegEx in StackOverflow. It also lets us share a RegEx result by creating a unique link for each RegEx test.

## ExtendsClass

ExtendsClass is a toolbox for developers. It provides RegEx testing support for the below languages.

• JavaScript
• Python (3.4)
• Ruby (2.1)
• Java (JDK 14)
• PHP (7)

## RegEx Tester

This free regular expression tester lets you test your regular expressions against any input of your choice and clearly highlights all matches. It also lets us save previously tested RegEx for future reference. Moreover, it supports JavaScript and PCRE RegEx.

## Web ToolKit

Web Toolkit contains a set of utility tools, and a RegEx tester is one of them. We can input our RegEx here and test it against a value. It also provides a facility for replacing, matching, and copying the expressions. Apart from this, it provides toggles for case-sensitive and global matching.

### Learning Resources

If you wish to learn RegEx, here are some of the best courses available online.

#### Coursera

Coursera offers interesting guided project courses which will give you hands-on experience using RegEx. Most of these project courses last about an hour and you will be working step-by-step along with the instructor. Here are some of the best RegEx projects.

#### Udemy

Udemy offers a Complete RegEx course for beginners which teaches you the basics in 3.5 hours and a Python RegEx Course with Projects which will give you hands-on experience using RegEx for input validation, data processing, and transformation.

### Conclusion

We learned about regular expressions, a few common examples, and some of the online testing tools. With this knowledge, we can create our own regular expressions and use them in our applications.