How to Remove Duplicates in HubSpot & Duplicate Management Strategies
Media Junction- HubSpot
- August 7, 2023
While the frequency of duplicate Contacts or Companies in a HubSpot portal can vary, most portals have had a duplicate at one point or another. There are a variety of factors that impact this, including: the size of the database, the data entry process, and the level of data hygiene practices implemented.
Luckily, HubSpot provides the tools to help identify and merge duplicate records, ensuring data accuracy and improving overall CRM efficiency.
In this blog post, we'll dive into:
- How Are Contacts Created in HubSpot?
- Why Do Companies and Contacts Duplicate?
- Why Duplicate Records Are Bad
- How to Minimize Company and Contact Duplication
- Navigating the Manage Duplicate Tool in HubSpot
- What Happens When I Merge a Record?
- Choosing Criteria for Merging
- Tools for Mass Duplicate Management
How Are Contacts Created in HubSpot?
There are a variety of ways that a Contact can enter a HubSpot portal:
- Manual entry: Contacts can be manually entered into the HubSpot CRM by users.
- Form submissions: Contacts can enter the portal by submitting forms on your website or landing pages that are integrated with HubSpot.
- Import: Contacts can be imported into the portal using CSV files or other supported formats.
- API integration: Contacts can be synced with the HubSpot CRM through API integrations with other systems.
- HubSpot Sales extension: Contacts can be added to the portal through the HubSpot Sales extension, which allows users to track and manage contacts directly from their email inbox.
Regardless of your choice, adding an email address to your Contact record is always best practice for several reasons. First, it makes that Contact actionable for Sales reps and Marketers in your portal. Without an email address or means of getting ahold of someone, what purpose is that Contact serving?
Second, HubSpot uses email addresses to associate Contacts to Companies automatically if they share a domain. So, mary.smith@google.com would automatically associate to Google.com.
Third, email addresses serve as a unique identifier for Contacts--kind of like an ID card. HubSpot knows that there should only be one instance of each email address in the system. Given this rule, if you import mary.smith@google.com into your HubSpot portal and this Contact already exists, HubSpot will add any information you're importing to the pre-existing Contact record.
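To make the "one record per email address" rule concrete, here is a minimal Python sketch of that kind of update-or-create behavior. It models the portal as a plain dictionary and is not HubSpot's actual implementation; the upsert_contact function, the lowercase normalization, and the choice to skip blank incoming values are all assumptions made for illustration.

```python
def upsert_contact(contacts, incoming):
    """Create a contact, or update the existing one with the same email.

    `contacts` is a hypothetical in-memory stand-in for a portal:
    a dict mapping a normalized email address to a dict of properties.
    """
    key = incoming["email"].strip().lower()  # assumed normalization
    existing = contacts.setdefault(key, {})
    for prop, value in incoming.items():
        # Assumption: blank incoming values never overwrite stored data.
        if value not in (None, ""):
            existing[prop] = value
    existing["email"] = key
    return existing


contacts = {}
upsert_contact(contacts, {"email": "mary.smith@google.com", "firstname": "Mary"})
# Same address (different capitalization): updates the existing record.
upsert_contact(contacts, {"email": "Mary.Smith@google.com", "jobtitle": "Engineer"})
# Typo in the domain: treated as a brand-new contact.
upsert_contact(contacts, {"email": "mary.smith@gogle.com", "firstname": "Mary"})
```

Note how the third call, with a typo in the domain, ends up as a separate record instead of updating Mary's existing one.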
Why Do Companies and Contacts Duplicate?
Even though HubSpot automatically deduplicates Contacts that share the exact same email address, there's still a high chance that duplication will occur. Why's that, you ask?
Simply put: any typos, misspellings, or domains from different departments can result in a whole new Contact--and Company--in your HubSpot portal.
See, when you create a Contact with a Company email address, HubSpot will automatically create a Company for that Contact. However, that means when a Contact comes in with typos in their email address, a Company could also be created with the errors.
In some circumstances, a Contact might even use their personal email address in addition to their work address, to make things even more confusing. If you haven't turned off this setting, a Company could still be created in this instance.
Needless to say, it's pretty easy for duplicate Contacts and Companies to crop up--from the Contact's data entry or your own. But why is this a bad thing?
Why Duplicate Records Are Bad
Duplicate records can lead to serious problems for your company in the long term.
To begin with, incorrect numbers of Contacts and Companies in your reports can lead to your team making misguided decisions based on unreliable data.
According to Harvard Business Review, bad data costs about $3.1 trillion per year in the United States alone.
Next, if your HubSpot portal contains two or more versions of a Contact, it can result in them being assigned to multiple sales representatives. This can lead to both representatives reaching out to the same person about the same topic. Of course, this is a quick way to irritate a potential lead and slow down your team’s productivity.
Furthermore, having various pieces of information scattered across multiple duplicate records makes it challenging to gain a comprehensive understanding of your lead's intentions, their position in the buying cycle, and the most effective way to communicate with them.
The beauty of HubSpot is that it acts as a hub, containing all of the personalized history and activity related to your Contacts and Companies. The best way to harness this power and get the most out of the tool is to have a clean, deduplicated database.
How to Minimize Company and Contact Duplication
Given the wide variety of ways that data can enter your HubSpot portal, it can be daunting to attempt to prevent the creation of duplicates. Let's focus on a few key strategies to adopt with your team.
Manual Entry
When manually entering Contacts, be mindful to spell the Contact's address and the email domain correctly. While HubSpot will show an error if you don't include the "@" symbol, errors like "@gmal.com" unfortunately do go through.
HubSpot does show a helpful error when entering a full email address that's already in your portal. For example, if abby@gmail.com is already in the system, HubSpot will not allow another version of it to be created manually.
Import
Importing Contacts into HubSpot also requires an eagle eye. If you are importing a spreadsheet, review your file to confirm that:
- Email addresses are spelled correctly (no @gmal.com, for example)
- Your import file is set up properly. This article goes into more detail.
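Part of that file review can be automated before you upload anything. The sketch below uses only Python's standard library to flag addresses whose domain is a near miss of a well-known provider. The KNOWN_DOMAINS list, the "Email" column name, and the 0.85 similarity cutoff are assumptions you would adjust for your own file.

```python
import csv
import difflib
import io

# Hypothetical list of domains you expect to see spelled correctly.
KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "outlook.com", "hotmail.com"]


def suspicious_emails(rows, email_column="Email"):
    """Return (email, suggested_domain) pairs for likely typos.

    Malformed addresses (missing "@", local part, or domain) are
    returned with a suggestion of None.
    """
    flagged = []
    for row in rows:
        email = row[email_column].strip().lower()
        local, sep, domain = email.rpartition("@")
        if not sep or not local or not domain:
            flagged.append((email, None))
            continue
        if domain in KNOWN_DOMAINS:
            continue
        # Close-but-not-equal domains are probably misspellings.
        match = difflib.get_close_matches(domain, KNOWN_DOMAINS, n=1, cutoff=0.85)
        if match:
            flagged.append((email, match[0]))
    return flagged


# A small in-memory file standing in for your real import spreadsheet.
sample = io.StringIO(
    "Email,First Name\n"
    "abby@gmail.com,Abby\n"
    "bob@gmal.com,Bob\n"
    "carol@example.com,Carol\n"
)
flagged = suspicious_emails(csv.DictReader(sample))
```

A check like this only catches near misses of the domains you list, so it complements a manual review and does not replace it.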
Another option in the quest for accurate email addresses is to invest in an application that verifies emails, like Neverbounce. This allows software to fish out any errors in your email spelling while also identifying emails that will inevitably bounce. Neverbounce and similar platforms work both during import and on an ongoing basis to keep your portal clean. Check out the HubSpot App Marketplace to find applications with a HubSpot integration.
Form Submissions
Form submissions are a way for leads to raise their hand in the customer journey and get in touch with your company. While we appreciate this information, sometimes duplicates or dirty data can result from a manual form submission. Let's review a couple settings to help avoid this.
Email domains to block
Depending on your business strategy, you may or may not want users to input their personal emails into your system. This setting, accessed by selecting the email field while editing a form, can block specific email domains, free email providers, or both. This helps you avoid duplicates caused by a Contact's personal and business emails both existing in your portal.
Always create contact for new email address
In the Options area of a Form, you can toggle on or off a setting reading: Always create contact for new email address. By default, this setting is off--and we recommend keeping it that way (unless you have a unique use case).
As you may know, HubSpot "cookies" users when they submit forms on your website via the HubSpot Tracking Code. If a cookied Contact submits a form with one email and then submits a second form with a second email, HubSpot will merge these Contacts automatically if it detects the same cookies or IP address. Both of these email addresses will then appear on the Contact's record for your reference.
Create and associate Companies with Contacts
HubSpot can automatically associate Contacts with Companies by matching the domain in a contact's email value to the company's Company domain name value. For example, a contact with the email address "sandra@example.com" will be associated to the company with the Company domain name "example.com."
With this setting turned on, if a company record doesn't already exist, HubSpot will automatically create a company record. Great! Except when you're a business that accepts a lot of freemium email accounts (e.g., gmail.com, yahoo.com). This can result in the creation of Companies for the freemium email domain, and you'll have 20+ unrelated Contacts connected to the Yahoo Company record, for example.
Located in the Company settings area, this setting does allow you to add a list of domains that you would like to exclude from automatic association. You can choose to add freemium email providers to this list, and any others that make sense to your business, to avoid the creation of these Companies.
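The domain-matching idea, together with an exclusion list, can be sketched in a few lines of Python. This is only an illustration of the logic described above, not HubSpot's implementation; the FREE_EMAIL_DOMAINS set and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical exclusion list, similar in spirit to the one in Company settings.
FREE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com"}


def company_domain_for(email, excluded=FREE_EMAIL_DOMAINS):
    """Return the domain to associate by, or None if it should be skipped."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not domain or domain in excluded:
        return None
    return domain


def group_by_company(emails):
    """Group email addresses under the company domain they would map to."""
    companies = defaultdict(list)
    for email in emails:
        domain = company_domain_for(email)
        if domain is not None:
            companies[domain].append(email)
    return dict(companies)


groups = group_by_company(
    ["sandra@example.com", "li@example.com", "abby@gmail.com", "joe@yahoo.com"]
)
```

Here the two example.com addresses land under one company, while the gmail.com and yahoo.com addresses are left unassociated.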
Navigating the Manage Duplicate Tool in HubSpot
In your HubSpot account, go to Contacts (if you want to deduplicate Contacts by email address) or to Companies (if you want to deduplicate Companies by domain). Either option will show you the button titled “Actions” on the top right. Click Actions > Manage Duplicates. If you're not seeing it, double-check your hub tier. This tool is available for Professional and Enterprise tiers of HubSpot.
This will take you to a new screen displaying potential pairs of duplicates. To merge them, click Review > Merge. If you don’t think these pairs are the same Contact or the same Company, click Dismiss.
But before you click Merge... there are a few things you want to consider when choosing which record to “keep.”
What Happens When I Merge a Record?
When you merge a record, you are basically combining the information of both records, but the one you choose to keep is going to be your primary record.
Let’s look at an example:
company.com and the misspelled version of the same company (coompany.com) both have the same information. Since we know the first option is the correct one, we choose the domain company.com to keep, and merge coompany.com into it.
Now the primary domain is company.com, and the second domain won’t be used anymore in HubSpot. Contacts associated with both domains will be displayed on the same company record, along with all historical activity, such as notes, emails, and calls.
- The most recent value for each property (e.g., "Company Name") will be used for the new record.
- Timeline activity for both Company records will be preserved in the new record.
- If the two records show different lifecycle stages, the stage furthest down the funnel is maintained.
- If there are two different owners assigned to duplicate Companies, the one belonging to the record you choose to keep will remain.
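The four rules above can be expressed as a short Python sketch. It is a simplified model for reasoning about merge outcomes, not HubSpot's merge logic: the record layout, the merge_records function, and the shortened LIFECYCLE_ORDER list are assumptions made for illustration.

```python
# Assumed funnel order, earliest stage first (simplified for this sketch).
LIFECYCLE_ORDER = ["subscriber", "lead", "opportunity", "customer"]


def merge_records(primary, secondary):
    """Merge `secondary` into `primary` following the rules listed above.

    Each property is stored as (value, updated_at) with ISO date strings,
    so the later date compares as greater.
    """
    merged_props = {}
    for name in primary["properties"].keys() | secondary["properties"].keys():
        candidates = [
            record["properties"][name]
            for record in (primary, secondary)
            if name in record["properties"]
        ]
        # Rule 1: the most recent value for each property wins.
        merged_props[name] = max(candidates, key=lambda pair: pair[1])
    return {
        "id": primary["id"],
        "domain": primary["domain"],  # the kept record's domain stays primary
        "owner": primary["owner"],    # Rule 4: the kept record's owner remains
        # Rule 3: keep the stage furthest down the funnel.
        "lifecyclestage": max(
            primary["lifecyclestage"],
            secondary["lifecyclestage"],
            key=LIFECYCLE_ORDER.index,
        ),
        "properties": merged_props,
        # Rule 2: timeline activity from both records is preserved.
        "timeline": sorted(primary["timeline"] + secondary["timeline"]),
    }


keep = {
    "id": 1, "domain": "company.com", "owner": "Ana", "lifecyclestage": "lead",
    "properties": {"name": ("Company", "2023-01-10")},
    "timeline": [("2023-01-10", "Note added")],
}
duplicate = {
    "id": 2, "domain": "coompany.com", "owner": "Ben", "lifecyclestage": "opportunity",
    "properties": {"name": ("Company Inc.", "2023-06-02"), "phone": ("555-0100", "2023-06-02")},
    "timeline": [("2023-06-02", "Call logged")],
}
merged = merge_records(keep, duplicate)
```

In this example the merged record keeps Ana as owner and company.com as domain, but takes the newer name and the "opportunity" stage from the duplicate.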
Choosing Criteria for Merging
If your duplicated records don’t have much activity, choosing a record to keep can be an easier decision. You can consider criteria such as:
- Keep the one with the most recent activity, signaling that this is the record in use
- Keep the most recently created record for potentially more updated information
- Keep the Company record with more Contacts associated
Those are all decisions you make with your team after knowing what has been imported to your database and knowing your clients.
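If it helps to discuss these criteria with your team, each one can be written down as a simple rule. The Python sketch below picks the record to keep from a pair under a chosen criterion; the field names and the pick_primary function are hypothetical, and dates are assumed to be ISO-formatted strings.

```python
def pick_primary(records, criterion):
    """Return the record to keep under one of the criteria listed above."""
    selectors = {
        # A missing activity date sorts before any real date.
        "most_recent_activity": lambda r: r["last_activity_date"] or "",
        "created_last": lambda r: r["create_date"],
        "most_contacts": lambda r: r["num_associated_contacts"],
    }
    return max(records, key=selectors[criterion])


records = [
    {"id": 101, "create_date": "2021-03-14",
     "last_activity_date": "2023-07-30", "num_associated_contacts": 2},
    {"id": 102, "create_date": "2022-11-02",
     "last_activity_date": "2023-01-15", "num_associated_contacts": 5},
]
by_activity = pick_primary(records, "most_recent_activity")
by_contacts = pick_primary(records, "most_contacts")
```

Notice that the two criteria disagree for this pair, which is exactly why it pays to agree on one before you start merging.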
However, if both duplicated records have activity and different owners and deal stages, it’s time to make a decision. Look into both records and decide which one should be the primary record. Remember that the activity and associated Contacts won’t be lost, but combined into the one record.
Another common situation is to find several records for the same company with slightly different domains representing different divisions inside an organization. Here, you can decide to have one general domain to represent the Company and have all Contacts from different departments associated with this main domain. Or, you might consider that you may be doing business with different people from different offices, or in different countries, and you don’t want to mix up this information. This may also be a case for defining a Parent-Child Relationship strategy.
Choosing criteria can take a little time, but armed with the discussions you've had with your colleagues and your own knowledge of your clients, you're sure to make a smart decision.
Set Up Properties to Review
Once you have decided the best criteria to merge your records, edit the properties to review, or properties visible when comparing records, to help speed up the process of analyzing your Contacts or Companies.
In the Manage Duplicates area in HubSpot, click on Edit Properties to Review on the top left:
There you will be able to choose which properties appear in the Review tab. When you click Review, a pop-up screen will show you the pair of records with a few properties, such as email, last activity date, and create date. If those three properties are enough information for you to choose a record to keep, that's great!
However, depending on your business, other properties may be more helpful. Here are a few good properties to consider including for your reference:
Contact Owner/Company Owner
Checking if the pair of duplicates has the same owner is a good idea. If they have different owners and you merge, the one you chose to keep will override the other one. This might confuse your team, so take this into consideration.
Create Date
Create date helps you find patterns in the duplication. Sometimes a single import caused the whole problem.
Last Activity Date
This is a good reference for deciding which record to keep: the one with activity, or the one with the most recent activity.
Number of Associated Deals
See if both records have deals. This is similar to the Contact owner issue if different team members own those deals.
Marketing Emails Opened
If you have two records of the same person, keep the one that has the email address where they actually open the emails.
Marketing Emails Delivered
If the emails are not being delivered, it's possible that this email address is invalid. Look at the record and see if you can update the email address or choose to keep the record that receives emails.
Include any other custom property relevant to your business that you want to make sure not to override. Sometimes you might still want to click on both records to look for more details, even after adding more properties in the review tab.
Tools for Mass Duplicate Management
Now, after all of these tips, you may be thinking to yourself: is there any way to expedite this manual process? And yes, there is! Let's review a couple tools that can help you manage duplicates on a larger scale.
HubSpot Operations Hub
The standard Manage Duplicates tool lets users review and merge records one pair at a time. With HubSpot Operations Hub Professional or Enterprise, you can select multiple records at once and merge them based on a single criterion: Oldest engagement, Most recent engagement, Created first, Created last, or Most recently updated. The criterion you choose applies to all of the records you have selected, and they merge according to that parameter.
This tool allows you to select up to 50 records at a time, which saves you time while making sure you still have control over the merging. You're also able to reject merge suggestions en masse.
HubSpot Integrations Like Insycle or Dedupely
There are also a number of integrations in the HubSpot App Marketplace that help HubSpot users define criteria for automatic merging, either on a one-time or ongoing basis. Examples of applications with this functionality include Insycle and Dedupely.
The premise of these tools is similar to that of HubSpot's Manage Duplicates tool: select a property that you would like to "prefer" in the merge. However, they also offer additional features, such as:
- Specifying a filter for your merge. For example, only review Contacts created before 2021.
- Creating a priority list for properties to "prefer" in the merge. If a record doesn't have the first property, the system automatically moves onto the next one.
- Allowing you the option to retain specific properties from the secondary record in the merge, even if you selected the other record as the primary.
- Sending you a report of the merging that occurred, specifying HubSpot record IDs if you need to cross-reference.
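The filter and priority-list ideas from the list above can be sketched in Python. This is one possible reading of a fallback rule, not a description of how Insycle or Dedupely work internally; the field names and the pick_by_priority function are hypothetical.

```python
def pick_by_priority(records, priority):
    """Pick the record to keep using an ordered list of properties.

    For each property in turn, the record with the highest value wins.
    If no record has the property, or the top value is tied, the next
    property in the list is tried. Falls back to the first record.
    """
    for prop in priority:
        candidates = [r for r in records if r.get(prop) not in (None, "")]
        if not candidates:
            continue  # nobody has this property: move on to the next one
        best = max(candidates, key=lambda r: r[prop])
        ties = [r for r in candidates if r[prop] == best[prop]]
        if len(ties) == 1:
            return best
    return records[0]


records = [
    {"id": 1, "create_date": "2019-03-01", "last_activity_date": None, "num_deals": 2},
    {"id": 2, "create_date": "2020-08-15", "last_activity_date": None, "num_deals": 0},
    {"id": 3, "create_date": "2022-02-20", "last_activity_date": "2023-05-05", "num_deals": 1},
]

# Filter first: only review records created before 2021.
eligible = [r for r in records if r["create_date"] < "2021-01-01"]

# Neither eligible record has activity, so the rule falls through to deals.
keep = pick_by_priority(eligible, ["last_activity_date", "num_deals"])
```

With the filter applied, record 3 is never considered, and record 1 is kept because it has more associated deals.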
Parent-Child Relationships
A parent-child relationship in your HubSpot CRM is when you relate two Companies to each other. Whether you work with franchises, distributors, school branches, or similar, this function enables you to navigate between related Companies quickly and easily, saving you time and unnecessary headache. This organization can also be helpful for reporting.
It's important to note that a record with a parent-child relationship cannot be merged in HubSpot. These records are displayed on the bottom right of a Company record as “Related Companies.” If you wish to merge a company that is marked as related to another company, you will need to remove the parent-child relationship first. Once all your merging is complete, you can choose to re-assign parent-child relationships as needed.
Additional Recommendations for Removing Duplicates in HubSpot
- Removing duplicates from HubSpot can be an intense task. Remember to take breaks often, go outside for a walk, and give your eyes and brain a rest!
- Be sure to check the Manage Duplicates tool regularly, as HubSpot calculates possible pairs every few weeks. Keeping up on this task will help you maintain a clean and accurate database.
To wrap it up, effectively managing duplicate records in HubSpot requires thoughtful consideration and analysis. By configuring key properties for review, you can streamline the process of analyzing and selecting records to merge. Additionally, exploring auxiliary tools can introduce more automation into the process. By adhering to these recommendations and regularly monitoring the Manage Duplicates tool, you can uphold a clean and precise database for your business in HubSpot.