The COUNTIF function and Data cleaning function are the two most straightforward ways to find duplicates in Google Sheets.
About Google Sheets
Google Sheets is a free, cloud-based spreadsheet application offered by Google as part of its complete office suite to compete with Microsoft Office. Google Sheets can prepare data, do calculations, and more.
It is utilized for modifying, organizing, and storing data. The Google Docs productivity package incorporates this program, consisting of Google Drive, Google Sheets, and Google Slides.
However, one of the benefits of Google Sheets is that, unlike Excel, it can be used collaboratively in real time with other users.
Google Sheets has simplified creating databases by offering a user-friendly interface and practical tools for organizing data.
Utilizing Google Sheets can increase productivity and save time. Users no longer worry about losing crucial data or unintentionally overwriting files because the framework is cloud-based.
One of the most widely used spreadsheet and database programs is Google Sheets, which is accessible on all popular desktop and mobile operating systems.
How Do Google Sheets Help to Create Databases Simpler?
Traditionally, creating databases has been a challenging and time-consuming operation. However, recent developments in technology, particularly Google Sheets, have greatly simplified this procedure. The program also has a lot of capabilities, like the ability to add charts, formulas, and images.
For real-time data management and storing, most businesses utilize it in place of other expensive databases like PostgreSQL, MySQL, SQL Server, etc. Most tiny data sets prefer Google sheets as an alternative.
Google Sheets are not a comprehensive database management system for your company. However, unlike other database management systems, it works wonders for small enterprises and projects to manage their firm financial operations effectively.
Importantly, it has several sophisticated capabilities that are accessible for free, like Google Sheets API, Apps script, Form triggers, Timed triggers, and ImportXML.
Users of the app script can connect to different databases because the app script includes a JDBC service for connecting to MySQL, Microsoft SQL Server, and other databases.
Companies can reduce their expenditures on various data management system products by using all of their capabilities.
Now, let’s discuss the impact of duplicate values in the database.
Impact of Duplicate Values in the Database
If the data is being used for analytics or reporting purposes, duplicate values in a database might negatively affect the data.
In rare circumstances, duplicate values might skew the data, which can result in inaccurate inferences being made from the data. In other instances, duplicating values may make the data more challenging to use and comprehend.
In either case, it’s critical to be knowledgeable about the potential effects of duplicate values in a database and to take precautions to prevent or lessen such effects.
Highlight Duplicates in Google Sheets in Single Column
This section will look at methods to draw attention to data duplication in Google sheets and to eliminate them.
The common formula used to highlight duplicates is:
Here is a step-by-step guide on how to use this formula in Google Sheets.
Navigate to the “Format” option in the top menu and scroll down to “Conditional Formatting” as shown below.
The new conditional formatting options will appear on the right-hand side after clicking the conditional formatting button, as seen below:
After defining the cell range where you wish to highlight duplicates, scroll down to the Format rules section and choose the “Custom formula is” option from the drop-down menu, as shown in the example above.
In this field, enter the formula:
The G cell must be substituted with the appropriate column based on your need to highlight duplication.
After hitting the done button, the duplicates from the column will be highlighted, as seen in the image above.
Through the formatting style settings, as illustrated below, you can change the highlight color or other formatting options:
We have seen how to highlight duplication from a single column in the above example.
In the below section, we will see how to highlight duplicates from multiple columns.
Highlight Duplicates in Google Sheets in Multiple Columns
All phases in this process will be the same as discussed above, except the formula and the range of cells we define to highlight the duplicates.
To better grasp it, let’s use one example.
Navigate to the “Conditional Formatting” option from the “Format” Navigation menu to run the formula. The options will be visible on the right side, as illustrated in the image below.
A cell range is specified in the “Apply to range” section
In the “Format rules” section, the formula used to highlight duplicates from multiple columns is:
It’s a very straightforward formula and easy to implement. The formula contains only a range of cells from which we have to identify the duplicates.
After executing this formula, duplicates from four columns are highlighted, as seen in the example below.
In this illustration, we’ve also changed the highlight color to blue.
We shall examine how to highlight duplicate data from rows in the following section.
Highlight Duplicates in Google Sheets in Rows
In a Google sheet, identifying duplicate rows of data is a little different from highlighting duplicate columns.
We won’t repeat every step as they are all the same except the formula.
Let’s take one example.
The cell range from which you want to highlight the duplicates is listed in the ‘Apply to range’ portion of the conditional format options, as shown in the picture below.
In the “Custom formula is” section, the formula entered is
The seemingly complicated and lengthy formula involves the start and finish of cell numbers.
The Google sheet below shows that the first Apple and Orange entries are skipped, and the last ones are highlighted.
We’ve seen how to highlight duplicate data in rows, columns, and with specific criteria up to this point.
We’ll quickly explore how to remove duplicate data using shortcuts in the following section.
Shortcut to Remove Duplicates
Choose the database or column for which you wish to delete duplicate records, select “Data Cleanup” from the menu at the top, scroll down to “Remove Duplicates”, and click OK, as shown in the image below.
You will see a pop-up box to choose alternatives and approve the removal of duplicates after clicking “Remove duplicates”, as shown in the picture below.
There is a selection for “Data as header row,” as shown in the image below. Check this box if your sheet has a header row to have Google Sheet ignore it while deleting duplicates.
The duplicates will be eliminated when you confirm your selections and click the “Remove duplicates” button, displaying the status below.
How to Create a Unique list?
Let’s extract a distinct list from a database with numerous duplicates.
The formula is incredibly straightforward to understand and apply.
To understand this, let’s take one of our earlier Google Sheets.
The syntax to create the unique list is
=unique(range start: range end)
As seen in the image below, the formula is set in column D, the cell range from which we want to extract the unique data is defined, and after it has been executed, the result is shown in column D.
How to Delete Whitespace in Google Sheets?
When working on a Google Sheet with text or numeric data, you frequently encounter excessive spaces in the cells, which is annoying.
In this part, we’ll look at how to get rid of those additional whitespaces.
Let’s Use the Google Sheet below to eliminate excessive whitespace.
Choose the column or set of data that has additional whitespaces, go to the “Data menu” at the top, scroll down to “Data cleanup”, and click “Trim Whitespace,” as displayed below.
When you select “Trim Whitespaces”, you will see the outcome as indicated in the picture below. As you can see, the column is now free of additional whitespace.
Data duplication is not only a needless waste of time, but it also has the potential to result in significant problems and costly errors.
If your Google sheet is small in size, you can find them manually, but if you have numerous sheets, it would take a lot of time and be error-prone. The techniques above will help you identify duplicates in your Google document and remove them all at once.
I have been in the IT industry for more than 20 years. I have served Fortune 100 company in a senior role looking after projects related to US Banking, BFS, Mortgage, Insurance, and FMCG. Apart from IT, I like to read books, especially spirituality… read more
Cloud data warehouses provide a scalable solution for storing and analyzing data in real-time. They enable businesses to reduce costs and improve efficiency by eliminating the need for on-premises hardware and maintenance. Here is our list of the best cloud data warehouse platforms.