Insycle Blog

How to Merge Duplicates in HubSpot and Salesforce and Keep them Syncing

Written by Ryan Bozeman | May 14, 2020 4:20:50 AM

Have you ever tried to deduplicate contacts across HubSpot and Salesforce while the two platforms are syncing? If you have, you know that it can get pretty complicated.

Fixing duplicate records in your CRM is difficult enough without having to worry about all of the issues that come along with two platforms syncing.

Both HubSpot and Salesforce have unique ways of handling duplicate records that need to be known and accounted for while merging duplicates. Failing to account for this could mean leaving many duplicate contacts floating in your system, unaccounted for, despite the fact they share a lot of data in other fields.

But the complications don’t just end at platform-specific nuances. When merging duplicate records across CRMs, you need to maintain the sync. Therefore, you need to determine the appropriate “master record” to use. But that is often easier said than done.

To ensure that the merge is correct and that you have the right master record in both platforms, you’ll need to manually check. Across a customer database that might have thousands of duplicate customer records, the task can quickly become overwhelming.

Then you have to think about how the merging process actually works. If two records are merged on Salesforce, are they merged on HubSpot as well? But even that can get a little murky. The real answer is “sometimes, depending on your settings.”

This leads to companies having to go through a tedious process, back and forth between both platforms, to run tests and make sure they understand how things work, check record IDs, ensure they are merging the right records, and keep the sync intact.

This complicated deduplication process is time-consuming and painful, even when we are talking about just a single pair of duplicate records on both platforms.

But consider how complicated this becomes when we have more than two duplicates on both HubSpot and Salesforce. What if a particular contact has six duplicate records on Salesforce and five duplicates on HubSpot? Will HubSpot and Salesforce even be able to recognize all of them as the duplicates that they are? Will you need to merge duplicates on both platforms separately?

Then there are considerations for how those duplicates are associated with other records in your databases. Is the master record associated with the correct company and deals? Do other duplicate records have associations that may become broken during the merge? Your account-based marketing teams might depend on those associations.

How do you untangle that web? How do you merge all records in a way that makes sense, while also ensuring that the records remain synced across the two platforms and nothing is broken in the process?

In this article, we’ll explain how the deduplication process works between HubSpot and Salesforce that are syncing, and explore why it is so painful. Then, we’ll take a look at how Insycle can help simplify deduping synced records and free you from being weighed down by repetitive manual tasks.

Let’s start by taking a look at why duplicate data is such a huge problem, particularly in two separate but syncing systems.

In this article, we’ll be specifically talking about dealing with duplicates between HubSpot and Salesforce, but similar issues, each with their own complications, exist with any CRM integration or sync.

The Impact of Duplicates on your Sales and Marketing Efforts

The pain of duplicate contact records reverberates throughout a company. They cost you money, harm your reputation, and sap the motivation and sanity of your teams that deal with them each day.

This fact is perfectly illustrated in a quote from Jonathan Block, SiriusDecisions Vice President of Product Development:

"The longer incorrect records remain in a database, the greater the financial impact. This point is illustrated by the 1-10-100 rule: It takes $1 to verify a record as it's entered, $10 to cleanse and de-dupe it and $100 if nothing is done, as the ramifications of the mistakes are felt over and over again."

This is a statement that he originally made in 2008. Years later — when customer data plays an even larger role in everything that companies do and customer expectations are much higher — you might wonder if those per-record costs aren’t several times higher than they were then.

Still, the 1-10-100 rule illustrates a very important point — it’s best never to let duplicate data into your CRM at all — but once it’s there, it’s critical that it is fixed quickly or the costs will grow exponentially over time.

67% of businesses rely on CRM data to target and segment customers. Without reliable duplicate data handling processes in place, it can be impossible to engage with customers in a personalized and effective way.

Duplicate data causes many issues within a company, but those problems compound when we are talking about separate pairs of duplicates across synced platforms.

Lost Productivity Across All Teams that Use Customer Data

Even in simple cases, without the complications of trying to untangle multiple duplicates across connected systems, duplicates are a productivity killer.

Think about it. Without consistent deduplication processes in place across your customer data, your teams are forced to deal with the problem themselves, manually.

Your marketing teams will need to dig into your data before sending campaigns out to fix obvious problems. You don’t want to deliver the same messaging to the same person multiple times. Or call them “james” instead of “James.” Even then, the chances that marketing will be able to catch all of the issues manually or using complicated Excel functions are slim.

Your sales teams will have to dig through customer databases to check for duplicate records before engaging with prospects. They have to make sure that they aren’t missing important context that is split up among multiple records, or else they risk going into a sales call looking like they haven’t done their homework.

Then sales reps have to choose the “right” record to enter data and notes from their call into. Sales teams depend on productivity and speed to meet their goals. Adding another step to their process, like double-checking for duplicate records, will certainly impact their results.

Your customer support and success teams have the same problem. They rely on context to provide an excellent experience to every customer that they engage with. When that context is split up between multiple customer records, they either have to search for them manually or settle for providing a lesser experience.

Those double checks and pit stops are all terrible for the productivity of those teams. Checking for duplicate records has to be built into their process every time that they engage with a customer.

Now imagine if a record didn’t just have one pair of duplicate records, but five. Or five duplicates on two different platforms, resulting in 10 independent records for a given customer. How can they ever determine what the “right” record is? Do they even have time to sift through all of those records for the tidbits of context that they need? Likely not.

Advanced duplicate data situations like this leave your teams with a trade-off — productivity or effectiveness. They have to choose whether they want to waste time diving into the different records or ignore them and accept the hit on their results. That’s a position that no company should put their teams in if they can help it.

Harmed Brand Reputation

Duplicate customer records can also harm the reputation of a brand. Consumer expectations are shifting. Today, 51% of consumers expect that companies will anticipate their needs and make relevant suggestions before they make contact. Companies are expected to hit the ground running in every engagement. You need a full picture of every contact in your system to make relevant suggestions and engage with them in a way that makes sense, based on their situation.

Imagine being a customer and receiving the same messages multiple times across several of your different emails.

Or maybe you can relate to dealing with a company and feeling like none of your previous interactions are being taken into account. Every time that you engage with a new person, it feels like you are starting from square one. It’s easy to see how this would quickly become aggravating.

Additionally, when dealing with many pairs of duplicates across multiple platforms, these problems compound.

Now there are more records that your team has to check for context. There are more vectors for personalization errors. When you do merge duplicates, you could break the sync for those records between the two platforms, further making the data harder for your teams to find and have confidence in.

Those types of situations are common when you have duplicate data, whether they are duplicate contacts, companies, accounts, or leads. Your teams will lack the context that they need to engage with prospects effectively, which ultimately harms your reputation with those prospects and customers.

A Shattered Single Customer View

Maintaining a single customer view is critical for delivering good experiences to customers and prospects. That single customer view gives your teams — marketing, sales, and support — a single source of truth for engaging with customers.

Only 8% of companies report having attained a single view of the customer to orchestrate personalization across channels. For most companies, it’s something that they are always striving toward.

With every duplicate contact record, that single customer view is eroded. One source of truth becomes two. Or three. Or four. Now there is no source of “truth” — just a collection of records that, together, offer a disconnected view of what the truth might be.

And in situations where customer records are not synced between HubSpot and Salesforce, three duplicate records become six. Now there are more records that your team will have to sift through to have a full context for their interactions and messaging. More opportunities for mistakes and personalization errors. Ultimately, this means a lesser experience for your prospects and customers.

Managing these situations is difficult. Each platform has its own nuances in how it handles merging duplicates. Then, there will be specific considerations for how the platforms interact and share data.

In the end, you just want a simple way to maintain a single customer view. But getting to that point can be a huge headache.

Why Managing Duplicates When HubSpot and Salesforce are Syncing Is Such a Huge Headache

Managing duplicates on connected HubSpot and Salesforce systems is a good example of an issue that appears simple on the surface. Once you dive in and start to discover its intricacies, you find that it becomes more complicated by the minute.

Let’s start by looking at how HubSpot and Salesforce handle merging duplicates when syncing in the best-case scenarios. Then we’ll take a look at more advanced cases to illustrate just how tangled the problem can become.

Duplicate HubSpot Contacts and Salesforce Contacts or Leads

HubSpot deduplicates Salesforce leads and contacts by matching email addresses. That is the only field that is used to identify duplicates. Often, this is enough to clean up a majority of duplicates. But if the same person appears in both databases with the same name, company name, and other fields, the records won’t be identified as duplicates, and therefore won’t merge, unless they share the exact same email address.

Two contacts could be nearly identical but use two different email addresses (jane@acme.com and jane@acme.co.uk, or a personal and a work email address, for example) and not be identified as duplicates.
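To make the limitation concrete, here is a minimal Python sketch of email-only duplicate detection. It is an illustration of the matching rule described above, not HubSpot's actual implementation; the record layout, the `group_by_email` function name, and the case-insensitive comparison are all assumptions.

```python
from collections import defaultdict

def group_by_email(records):
    """Group record IDs by email address (hypothetical helper).

    Assumption: emails are compared after trimming and lowercasing.
    Only groups with more than one record count as duplicates.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record["email"].strip().lower()].append(record["id"])
    return {email: ids for email, ids in groups.items() if len(ids) > 1}

# Illustrative data: three records for the same person.
records = [
    {"id": 1, "name": "Jane Doe", "company": "Acme", "email": "jane@acme.com"},
    {"id": 2, "name": "Jane Doe", "company": "Acme", "email": "jane@acme.com"},
    {"id": 3, "name": "Jane Doe", "company": "Acme", "email": "jane@acme.co.uk"},
]

duplicates = group_by_email(records)
print(duplicates)
```

Record 3 has the same name and company as the other two, but it never appears in a duplicate group because its email address differs.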

This means that you’ll have to use a manual process or complicated Excel functions to identify and merge these duplicates on your own in both systems. If you merge duplicates in Excel, you have to keep a close eye on the record associations between the two platforms to ensure that when you are done, the sync remains in place.

But even when a pair of duplicates across both platforms do share an email address, there are other considerations that can make cross-platform deduplication difficult.

HubSpot does offer some automatic deduplication features while the sync to Salesforce is active that can be helpful.

When a new contact record is created, HubSpot will search Salesforce for records that have a matching email. Salesforce will return all the records that match and HubSpot will pick one of them to sync with HubSpot. On subsequent updates in Salesforce, HubSpot will sync with the most recently updated contact.

But there is no guarantee that the record picked is the record that you want to sync to. If you have duplicates in your Salesforce database, a record might be the most recently updated but contain lower quality data than another record.
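The following Python sketch illustrates that risk. It models only the "most recently updated" rule described above; the field names, the sample records, and the idea of counting filled-in fields as a rough quality measure are assumptions made for illustration.

```python
from datetime import datetime

def most_recently_updated(matches):
    """Pick the record with the latest modification time."""
    return max(matches, key=lambda record: record["last_modified"])

def filled_field_count(record):
    """Count non-empty data fields as a rough, hypothetical quality measure."""
    ignored = ("id", "last_modified")
    return sum(1 for key, value in record.items() if key not in ignored and value)

# Two Salesforce records that share an email address (illustrative data).
matches = [
    {"id": "A", "last_modified": datetime(2020, 3, 1),
     "email": "jane@acme.com", "phone": "555-0100", "title": "VP Sales"},
    {"id": "B", "last_modified": datetime(2020, 5, 1),
     "email": "jane@acme.com", "phone": "", "title": ""},
]

picked = most_recently_updated(matches)
print(picked["id"], filled_field_count(picked), filled_field_count(matches[0]))
```

Record B wins because it was touched last, even though record A carries more complete data.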

Deduplicating Contacts from Salesforce

When you merge duplicates in Salesforce directly, you have to make sure that all duplicates are merged into a master record that is syncing with HubSpot. If you merge into the wrong record, the sync breaks and data might be discarded.

When the sync breaks, you’ll see an indicator on the sync card for the corresponding HubSpot contact.

Figuring out the “right” record to merge into can be tedious, and requires you to analyze the data, by hand, on both platforms.

To identify which Salesforce record is syncing with a HubSpot record, you have to navigate to the HubSpot contact. In the left panel titled About This Contact Card, click View All Properties.

There, you can search for the Salesforce Lead ID or Salesforce Contact ID property. You’ll have to save the value, then find the corresponding record in Salesforce using that ID just to identify which records are syncing with each other.
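If you export your HubSpot contacts, the cross-check can be sketched in a few lines of Python. This is only an illustration of the lookup described above: the column names (`hubspot_id`, `salesforce_contact_id`, `salesforce_lead_id`), the sample IDs, and both helper functions are hypothetical, so adjust them to match your own export.

```python
def synced_salesforce_ids(hubspot_rows):
    """Map each Salesforce ID found on a HubSpot contact to that contact's ID."""
    synced = {}
    for row in hubspot_rows:
        sf_id = row.get("salesforce_contact_id") or row.get("salesforce_lead_id")
        if sf_id:
            synced[sf_id] = row["hubspot_id"]
    return synced

def choose_master(salesforce_duplicate_ids, synced):
    """Return the one duplicate that is syncing, or None if it is ambiguous."""
    candidates = [sf_id for sf_id in salesforce_duplicate_ids if sf_id in synced]
    return candidates[0] if len(candidates) == 1 else None

# Illustrative export rows and a group of Salesforce duplicates.
hubspot_rows = [
    {"hubspot_id": 101, "salesforce_contact_id": "SF-002", "salesforce_lead_id": ""},
    {"hubspot_id": 102, "salesforce_contact_id": "", "salesforce_lead_id": ""},
]
synced = synced_salesforce_ids(hubspot_rows)
master = choose_master(["SF-001", "SF-002", "SF-003"], synced)
print(master)
```

Returning None when zero or several duplicates are syncing is a deliberate choice in this sketch: those groups still need a manual decision.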

Switching back and forth between platforms to make sure that you get this very important step right is tedious, but necessary. Imagine having to do this for hundreds or even thousands of duplicates — you’d want to tear your hair out.

You also have to consider your HubSpot and Salesforce sync settings. This is important because having the wrong setting could result in lost data on HubSpot.

In the HubSpot and Salesforce sync settings in HubSpot, you’ll find a section that allows you to choose how specific changes to data are handled on HubSpot.

 

Here you can choose whether or not a contact or lead is deleted when the corresponding contact has been deleted in Salesforce.

It seems like something you would want, right? You wouldn’t think that if you delete a contact record in Salesforce that you want it to remain in HubSpot.

But the issue is that when deduplicating contacts, records can be accidentally deleted. If HubSpot contacts are set to be deleted when the corresponding Salesforce contact is deleted, when duplicate contacts are merged in Salesforce, the duplicate contact in HubSpot will get deleted, and not merged into the corresponding master in HubSpot. That could result in lost data.

On the other hand, if HubSpot contacts are not deleted automatically by the sync upon merge, you’d end up with lingering duplicates in HubSpot that you would now have to deduplicate directly in HubSpot.
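The trade-off between the two settings can be simulated in a short Python sketch. It models only the behavior described above, with a hypothetical record layout and function name; it does not call either platform.

```python
def hubspot_after_salesforce_merge(hubspot_contacts, merged_away_sf_id,
                                   delete_setting_enabled):
    """Return the HubSpot contacts left after a Salesforce merge.

    merged_away_sf_id is the Salesforce duplicate that the merge removed.
    """
    remaining = []
    for contact in hubspot_contacts:
        if contact["salesforce_id"] == merged_away_sf_id and delete_setting_enabled:
            # Deleted rather than merged: data held only here is lost.
            continue
        remaining.append(contact)
    return remaining

# Illustrative data: two HubSpot contacts for the same person.
contacts = [
    {"hubspot_id": 1, "salesforce_id": "SF-MASTER", "notes": "Renewal call booked"},
    {"hubspot_id": 2, "salesforce_id": "SF-DUPLICATE", "notes": "Asked for pricing"},
]

with_delete = hubspot_after_salesforce_merge(contacts, "SF-DUPLICATE", True)
without_delete = hubspot_after_salesforce_merge(contacts, "SF-DUPLICATE", False)
print(len(with_delete), len(without_delete))
```

With the setting enabled, the note held only on the duplicate disappears. With it disabled, both contacts remain and the duplicate still has to be merged in HubSpot.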

Deduplicating Contacts from HubSpot

The process of deduplicating contacts in HubSpot with the sync active is similar to doing so directly in Salesforce.

To keep the sync active when you merge in HubSpot, you need to follow the same process: identify the contact that is currently syncing with Salesforce and choose that record as the master record. That means sifting through your records and cross-checking Salesforce IDs, which is a time-consuming manual process.

Duplicate HubSpot Companies and Salesforce Accounts

Duplicate companies and accounts are a critical issue for ABM teams. With duplicate companies, contacts might be associated with different company records, causing your teams to overlook important stakeholders and miss opportunities.

It is possible to have duplicate companies in HubSpot, irrespective of Salesforce. For example, when people import new contacts and companies into HubSpot, new duplicate companies may get created.

Additionally, HubSpot’s auto association feature can create duplicate companies from contacts’ emails. For instance, when different subdomains are used, john@math.school.edu and jane@science.school.edu will result in two companies for the same school. The same happens with country code top-level domains, such as acme.com and acme.co.uk.

Company records can also be created by outside integrations with third-party software. Those systems may or may not have their own rules for creating new company records or identifying existing records, so it is important that you understand how those integrations work.

With respect to the Salesforce sync, HubSpot can deduplicate companies using the Salesforce Account ID Field. When contacts are syncing with Salesforce, HubSpot will associate the contact to the company with the matching Salesforce Account ID. If one does not exist, HubSpot will create a new company and associate it to the contact.
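That find-or-create behavior can be sketched in Python as follows. This is a simplified model of the rule described above, not HubSpot's code; the dictionary keys, the sample IDs, and the `associate_company` function are assumptions.

```python
def associate_company(contact, companies):
    """Return the company matching the contact's Salesforce Account ID.

    If no company has that ID, create one and add it to the list.
    """
    account_id = contact["salesforce_account_id"]
    for company in companies:
        if company.get("salesforce_account_id") == account_id:
            return company
    new_company = {
        "name": contact.get("company_name", ""),
        "salesforce_account_id": account_id,
    }
    companies.append(new_company)
    return new_company

# Illustrative data: one existing company, two syncing contacts.
companies = [{"name": "Acme", "salesforce_account_id": "ACC-1"}]
existing = associate_company(
    {"company_name": "Acme", "salesforce_account_id": "ACC-1"}, companies)
created = associate_company(
    {"company_name": "Acme", "salesforce_account_id": "ACC-2"}, companies)
print(len(companies))
```

The second contact shows how a duplicate account in Salesforce becomes a second "Acme" company in HubSpot: the names match, but the Account IDs do not.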

Deduplicating Accounts from Salesforce

Deduplicating accounts in Salesforce is possible. However, when you deduplicate accounts in Salesforce, the master is kept in sync with HubSpot, but the duplicate company in HubSpot remains. It no longer has a Salesforce Account ID because the duplicate in Salesforce was merged.

This means that you may be able to solve a duplicate accounts problem in Salesforce, but the disorganization remains in HubSpot, with no easy way to deduplicate them. Those duplicate companies linger orphaned in HubSpot, cluttering your database. The problem compounds since you cannot deduplicate companies in HubSpot when it is syncing with Salesforce.

Deduplicating Companies from HubSpot

One big caveat is that you cannot merge duplicate HubSpot companies when the HubSpot-Salesforce integration is active.

To merge HubSpot companies, you would have to uninstall the integration, deduplicate your company records, then reinstall the integration. This is a major undertaking that could lead to a list of additional problems as a result.

Account-based marketing and sales are impacted. You may still have contacts that are associated with the wrong company records, forcing your teams to sift through the data manually for context. In many cases, they will miss the context entirely.

Duplicate HubSpot Deals and Salesforce Opportunities

Deduplicating HubSpot deals and Salesforce opportunities is another problem area for sales teams. Neither platform provides a native way to dedupe deals or opportunities, and as a result there is no way to dedupe across both systems while the sync is active either.

Still, there are situations where duplicate deals and duplicate opportunities arise and can cause confusion and problems.

The Painful Process of Deduplication and Syncing for HubSpot and Salesforce

The situation becomes increasingly complicated when you are dealing with a larger number of duplicate records.

You have to choose the right master record. You have to make sure that the master record is appropriately synced between both platforms. You have to make sure that you are effectively merging the records to maintain customer data and facilitate a single customer view.

All of these separate factors culminate in what can feel like a situation that is impossible to manage. The whole problem of deduplicating HubSpot and Salesforce while they sync is a little hard to wrap your head around, let alone execute.

But using Insycle, you can simplify the process and ensure that you are consistently deduplicating across both platforms while ensuring your records remain in sync.

Insycle — Complete Duplicate Management Solution for HubSpot and Salesforce

Insycle takes this process of deduplication across HubSpot and Salesforce databases and simplifies it, without all of the manual checking and confusion. With Insycle, you can install a duplicate handling process that delivers reliable and consistent rules and behavior across both platforms.

With Insycle, you can ensure that both your HubSpot and Salesforce databases are deduped and that the sync remains active. It is accomplished by “tagging” the master record when merging in one of the CRMs, then using that tag in the master selection rules in the other CRM. The result is a pair of master records, one on each platform, that remain in sync.

You can start your deduplication process from either Salesforce or HubSpot, and then deduplicate in the other CRM.

Let’s break down how it works step by step. In this example, we’ll start by merging in Salesforce, followed by merging in HubSpot. However, you can use the same process in the opposite direction, that is, first in HubSpot and then in Salesforce.

Step 1: Create Custom Fields to Tag the Master Record

In order to tag the master you would need to create a custom field in both platforms to capture the master tagging. This is a one-time setup.

The new custom field needs to be named “Deduplication Master Record” and it needs to be added to any record that you plan to deduplicate. 

Insycle will automatically populate this field with the right value. To prevent users from accidentally changing its value, you may want to hide this field from the default layout or make it non-editable from the view.

Add the Custom Field in Salesforce for Accounts

  • Label: Deduplication Master Record
  • API name: Deduplication_Master_Record__c
  • Data type: checkbox

 

Add the Custom Field in HubSpot for Companies

  • Label: Deduplication Master Record
  • API name: deduplication_master_record
  • Data type: single checkbox

 

Next, in the Sync Settings, you'd need to set it to copy the value of the custom field from Salesforce into HubSpot (one way).

 

You can follow the same process for Contacts, Leads, Opportunities and Deals, and create the appropriate custom field and sync mappings.

Step 2: Deduplicate Salesforce Contacts in Insycle

Now you can start the process of merging Salesforce duplicates with Insycle, in bulk and automatically.

You can go through the deduplication process just as you would if you were doing so without the sync in place. In this example, we deduplicate Salesforce Accounts using:

  • Account Name: ignoring common terms like Inc., Incorporated, and LLC, such that "acme inc" and "Acme Incorporated" are matched.
  • Website URL: ignoring top-level domain, subdomain, and protocol, effectively using the domain name for comparison. For example: acme instead of www.acme.com, http://acme.com, https://acme.co.uk, uk.acme.com
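To show the idea behind these two matching rules, here is a minimal Python sketch that reduces an account name and a website URL to comparison keys. It is not Insycle's implementation. The term list, the list of second-level labels such as "co" in "co.uk", and both function names are assumptions, and the lists are deliberately incomplete.

```python
import re
from urllib.parse import urlparse

# Hypothetical, incomplete lists used only for this illustration.
COMMON_TERMS = {"inc", "incorporated", "llc", "ltd", "corp", "corporation"}
SECOND_LEVEL_LABELS = {"co", "com", "org", "net", "ac", "gov", "edu"}

def name_key(name):
    """Lowercase the name, drop punctuation, and drop common company terms."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(word for word in words if word not in COMMON_TERMS)

def domain_key(url):
    """Reduce a URL to its bare domain name, such as 'acme'."""
    # urlparse only fills in the hostname when a scheme or '//' is present.
    host = urlparse(url if "://" in url else "//" + url).hostname or ""
    labels = host.split(".")
    if len(labels) > 1:
        labels = labels[:-1]  # drop the top-level domain
    if len(labels) > 1 and labels[-1] in SECOND_LEVEL_LABELS:
        labels = labels[:-1]  # drop the 'co' in 'co.uk'
    return labels[-1] if labels else ""

urls = ["www.acme.com", "http://acme.com", "https://acme.co.uk", "uk.acme.com"]
print({domain_key(url) for url in urls})
print(name_key("acme inc") == name_key("Acme Incorporated"))
```

A production matcher would more likely rely on a maintained public suffix list than on a hand-written set of labels; the short set here just keeps the sketch self-contained.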

During this process, you can set flexible rules for choosing a “master record” that all of the other duplicate records will merge into.

 

As part of the merge process, Insycle will automatically populate the “Deduplication Master Record” with the value “TRUE” for the record that is chosen as the master, based on the Master Selection rules.

Step 3: Use Insycle to Deduplicate Your HubSpot Records

With Insycle you can deduplicate HubSpot companies in bulk and automatically, even when the sync with Salesforce is active. You do not need to uninstall the Salesforce integration from HubSpot.

After the merge in Salesforce, the value of “Deduplication Master Record” which was set by Insycle will automatically sync from Salesforce to HubSpot and can be used in the "Master Selection" criteria.

In the “Master Selection” step, set only one rule: “Deduplication Master Record” is True (Yes). This will ensure that the master record on HubSpot aligns with the master record on Salesforce. The “Deduplication Master Record” value is available in HubSpot due to the sync.

Run the deduplication process in Insycle as you normally would. Now, all of your duplicates can be merged, across both HubSpot and Salesforce, while maintaining the proper master record and maintaining the sync.

No more duplicates. No more having to check Salesforce Account IDs and double-check to make sure that you are picking the right master record. No more confusing webs of duplicates across multiple platforms. You run deduplication in both CRMs, and tie the syncing together using Insycle’s advanced deduping features.

All you have to do is add a custom field to your record types in both platforms, and Insycle will handle the rest.

With Insycle you can run the deduplication process in preview mode to review the changes before they're updated in the CRM, generate a CSV report of the duplicates, and set up deduplication to run automatically on a recurring basis.

Putting it all together: An example

In this example, we had a duplicate account in Salesforce and a corresponding duplicate company in HubSpot.

Salesforce                             | HubSpot
Acme - Account ID 0011N00001sZkioQAC   | Acme - Salesforce Account ID 0011N00001sZkioQAC
Acme - Account ID 0011N00001sZkQHQA0   | Acme - Salesforce Account ID 0011N00001sZkQHQA0

 

We used Insycle to bulk merge Accounts in Salesforce, and Acme was one of the accounts that got merged.

As part of the merge Insycle automatically populated “Deduplication Master Record” for the master Account ID 0011N00001sZkQHQA0 in Salesforce.

After the merge, the Salesforce-HubSpot sync propagated two changes to HubSpot:

  1. Updated the “Deduplication Master Record” field for the company linked to Salesforce Account ID 0011N00001sZkQHQA0 to match the value set by Insycle in Salesforce.
  2. Removed the Salesforce Account ID from the company that was linked to Salesforce Account ID 0011N00001sZkioQAC because that Account does not exist anymore in Salesforce.

Here is a screenshot showing the companies in HubSpot after the merge in Salesforce.

As you can see, the company linked to Salesforce Account ID 0011N00001sZkQHQA0 is tagged as the master, and we have an orphaned company, a duplicate company not linked to Salesforce.

Next, we deduplicate companies in HubSpot and the goal is to retain the company linked to Salesforce Account ID 0011N00001sZkQHQA0 in order to maintain the sync with Salesforce. 

We accomplished that by using “Deduplication Master Record” is "Yes" in the Master Selection rules. That ensures that the Acme company linked to Salesforce Account ID 0011N00001sZkQHQA0 is picked.

Now there are no more duplicates in Salesforce and HubSpot, and the merged records are syncing. We can run deduplication in bulk across the entire Salesforce and HubSpot databases and schedule it to run automatically on a recurring basis.
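The whole flow from this example can be simulated in a short Python sketch. It models the three stages described above (merge and tag in Salesforce, one-way sync, master selection in HubSpot) using the Account IDs from the example. The functions and record layout are hypothetical stand-ins and do not call Insycle, Salesforce, or HubSpot.

```python
MASTER_FIELD = "deduplication_master_record"

def merge_salesforce_accounts(accounts, master_id):
    """Keep only the chosen master account and tag it as the master."""
    master = next(a for a in accounts if a["id"] == master_id)
    return [{**master, MASTER_FIELD: True}]

def sync_to_hubspot(companies, salesforce_accounts):
    """One-way sync: copy the tag, clear IDs of accounts that no longer exist."""
    by_id = {account["id"]: account for account in salesforce_accounts}
    synced = []
    for company in companies:
        account = by_id.get(company["salesforce_account_id"])
        if account is None:
            synced.append({**company, "salesforce_account_id": None})
        else:
            synced.append({**company, MASTER_FIELD: account[MASTER_FIELD]})
    return synced

def select_hubspot_master(companies):
    """Apply the single rule: 'Deduplication Master Record' is Yes."""
    tagged = [c for c in companies if c.get(MASTER_FIELD)]
    if len(tagged) != 1:
        raise ValueError("expected exactly one tagged master")
    return tagged[0]

salesforce = [{"id": "0011N00001sZkioQAC", "name": "Acme"},
              {"id": "0011N00001sZkQHQA0", "name": "Acme"}]
hubspot = [{"name": "Acme", "salesforce_account_id": "0011N00001sZkioQAC"},
           {"name": "Acme", "salesforce_account_id": "0011N00001sZkQHQA0"}]

salesforce = merge_salesforce_accounts(salesforce, "0011N00001sZkQHQA0")
hubspot = sync_to_hubspot(hubspot, salesforce)
master = select_hubspot_master(hubspot)
print(master["salesforce_account_id"])
```

Because the tag travels with the sync, the HubSpot merge needs no lookup of Salesforce Account IDs: the only rule is the tag itself, and the orphaned company is the one left without it.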

Insycle Unified Data Management across HubSpot and Salesforce

Integration between HubSpot and Salesforce is both necessary and difficult. There are many nuances between the platforms and the way they work and share data between them. Insycle helps you to navigate those nuances and focus on what matters — eliminating duplicate data in your customer databases and keeping your sync active.

Insycle goes beyond data deduplication. It is a unified data management solution for a variety of data cleansing, data operations, and data collaboration needs. Insycle makes it simple to manage, automate, and maintain customer data in order to improve the results of your sales and marketing efforts.

Unlike tools that tackle one data problem for one CRM, Insycle offers a complete solution for all your customer data management needs, and it works across multiple CRMs in the same way. Often when companies adopt Insycle, they consolidate and replace existing tools, which results in a net cost saving for the company. One training, one security audit, one vendor relationship, for all your data management needs. Your procurement and IT departments will be happy too.

Looking to simplify the process of managing duplicates across HubSpot and Salesforce that are syncing? Fill out the form below to start your free trial.