When you deduplicate and merge records in your CRM, how do you determine which record remains and which one gets removed? How do you know that you are picking the right ‘master’ record? Making the right decision helps you to avoid unnecessary problems that can impact your ability to effectively engage with a prospect or customer down the road.
Duplicate contact data, companies, and deals make it difficult for your marketing teams to inject data into marketing automation campaigns, cause sales teams to waste time and make mistakes in the CRM and when engaging with prospects, and hinder the ability of support teams to provide a consistently excellent customer experience to every customer.
Duplicate customer data is much more common than most companies realize. The average duplicate rate in a database can be as high as 20%-30%. According to research from SiriusDecisions, “It takes $1 to verify a record as it's entered, $10 to cleanse and de-dupe it and $100 if nothing is done, as the ramifications of the mistakes are felt over and over again.”
You know you want to take care of your duplicates problem. It seems like it should be easy. You’re just merging two (or more) similar records, right? No problem. You’ll just keep the ones with the most accurate, updated information.
But the more you dig into your duplicates problem, the more you find out that merging duplicate contacts, or any duplicate record for that matter, is a bit more complicated than it seems on the surface.
Some of the issues that you’ll run into as you go through the process of deduplicating CRM data include:
In this article, we’ll deep-dive into how to go about choosing the right master record when merging duplicates to improve CRM data deduplication effectiveness and minimize data loss.
CRM deduplication is the process of merging duplicate contact data, companies, and deals in your CRM system. These duplicates may be exact match duplications of another record, but often are partial matches, meaning that there is only partial data overlap between the records.
In this article, we’ll be using a couple of terms. Let’s define them and explain what they mean:
With this in mind, let’s dive into why it is so important that you pick the right master record when deduplicating.
Picking the right master records ultimately determines the effectiveness of any CRM deduplication campaign. Luckily, while picking the right master record is important, it isn’t necessarily difficult.
Here are a few of the key reasons why choosing the right master record is important.
Choosing the right master record is often a very important decision, although that is not always the case. If you accidentally imported the same .CSV twice, all of your duplicate groups are exact matches, so the choice of master record probably does not matter. There may be other situations where your records are shallow, you are simply concerned about searchability, and you wouldn’t experience any negative side effects from designating either record as the master.
Depending on your internal data processes, there may be a clear, rule-based way that you choose a master (such as the first created record) that will effectively merge duplicate entries while retaining important data.
Let’s consider an example. Let’s say you have two records with different emails, but the same address, phone, and other data.
If you pick the record with jane@gmail.com instead of jane@acme.com, all you’ll need to do to fix the issue is change the primary email on the contact. Both emails are retained. You just make a simple swap and all of the other values — notes, email, etc. — are retained.
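The swap described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular CRM stores emails: the `primary_email` and `additional_emails` field names and both helper functions are hypothetical.

```python
def merge_contacts(master, duplicate):
    """Merge duplicate into master: the master's primary email stays primary,
    and every other email from both records is retained."""
    emails = [master["primary_email"], *master.get("additional_emails", [])]
    for email in [duplicate["primary_email"], *duplicate.get("additional_emails", [])]:
        if email not in emails:
            emails.append(email)
    merged = dict(master)
    merged["additional_emails"] = emails[1:]
    return merged


def set_primary_email(contact, email):
    """Promote a retained email to primary and demote the old primary."""
    all_emails = [contact["primary_email"], *contact["additional_emails"]]
    if email not in all_emails:
        raise ValueError(f"{email} is not on this contact")
    updated = dict(contact)
    updated["primary_email"] = email
    updated["additional_emails"] = [e for e in all_emails if e != email]
    return updated


# Hypothetical records: the "wrong" master (personal email) was chosen.
personal = {"primary_email": "jane@gmail.com", "additional_emails": []}
work = {"primary_email": "jane@acme.com", "additional_emails": []}

merged = merge_contacts(personal, work)
fixed = set_primary_email(merged, "jane@acme.com")
print(fixed)
```

Because the merge keeps both addresses, correcting the choice afterwards is a single swap rather than a data recovery exercise.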
But, with that said, there are some effects that you’ll feel from choosing the wrong master record. Those include:
The best way to merge duplicates is through custom merge processes that define a step-by-step method for choosing master records, rather than leaving it up to guesswork each time that you deduplicate.
Sometimes determining the right master record can be complex. Across duplicate records, data can be incomplete or differ in a variety of ways. Some records may be missing critical data, but offer more complete data in other fields.
Let’s assume that you have two different records that have been identified as duplicates. They might look a little something like this:
| First Name | Last Name | Email | Company | Job Title | Company Size |
| --- | --- | --- | --- | --- | --- |
| Dawn | Smith | d.smith@acme.com | Acme Inc. | CMO | 50 |
| D | Smith | d.smith@gmail.com | Acme Inc. | Chief Marketing Officer | N/A |
This presents a pretty straightforward duplicate merging situation and a simple choice for your master selection.
The top record is the more complete of the two. It includes Dawn’s full first name, while the second record includes only an initial. The top record features her business email rather than a personal Gmail account. It also includes company size data that isn’t present in the duplicate entry.
Here, choosing the master record is easy. One record is clearly the right choice.
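When one record is clearly more complete, the choice can be approximated by counting populated fields. The sketch below is one simple way to do that, using the two records from the table above. The field names, the `EMPTY_VALUES` placeholder list, and the helper functions are assumptions made for illustration.

```python
from typing import Optional

# Values treated as "empty" when scoring a record; this list is an assumption.
EMPTY_VALUES = {"", "n/a", "none", "null"}


def is_populated(value: Optional[str]) -> bool:
    """Return True if a field holds a real value rather than a placeholder."""
    return value is not None and value.strip().lower() not in EMPTY_VALUES


def completeness(record: dict) -> int:
    """Count the populated fields in a record."""
    return sum(1 for value in record.values() if is_populated(value))


def pick_most_complete(records: list) -> dict:
    """Return the record with the most populated fields (first one wins ties)."""
    return max(records, key=completeness)


records = [
    {"first_name": "Dawn", "last_name": "Smith", "email": "d.smith@acme.com",
     "company": "Acme Inc.", "job_title": "CMO", "company_size": "50"},
    {"first_name": "D", "last_name": "Smith", "email": "d.smith@gmail.com",
     "company": "Acme Inc.", "job_title": "Chief Marketing Officer",
     "company_size": "N/A"},
]

master = pick_most_complete(records)
print(master["email"])
```

A plain count does not know that a full first name beats an initial or that a business email beats a personal one, so it works best as one signal among several rather than as the only rule.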
There are often situations, however, where the ‘right choice’ is not as clear. What if the two records looked like this:
| First Name | Last Name | Email | Company | Job Title | Company Size |
| --- | --- | --- | --- | --- | --- |
| Dawn | Smith | d.smith@acme.com | Acme Inc. | CMO | 50 |
| Dawn | Smith | d.smith@acme.com | Acme Inc. | VP of Marketing | 100 |
Now the waters get a little murkier.
Here the records are the same, except one lists Dawn’s job title as “CMO” while the other lists it as “VP of Marketing.” The company size has different figures as well. Which record is the right one? There will be differences in the types of marketing messaging Dawn will receive from your marketing and sales teams depending on her role.
But it can get even more complicated.
| First Name | Last Name | Email | Company | Job Title | Company Size |
| --- | --- | --- | --- | --- | --- |
| Dawn | Smith | d.smith@acme.com | Acme Inc. | CMO | 50 |
| D. | Smith | d.smith@gmail.com | Acme Inc. | VP of Marketing | 100 |
| Dee | S. | dawn.smith@acme.com | Acme | Chief Marketing Officer | N/A |
Three different first names. Two different last names. Three different emails. Two different companies. Two different job titles with standardization issues. Conflicting company size data. All spread across three different “duplicate” customer records.
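A quick way to see this kind of disagreement in your own data is to list, for each field, the distinct values found across a duplicate group. The sketch below does that for the three records above. The field names and the `conflicting_fields` helper are hypothetical, and the comparison is on raw strings, so "CMO" and "Chief Marketing Officer" count as separate values and "N/A" counts as a value too.

```python
records = [
    {"first_name": "Dawn", "last_name": "Smith", "email": "d.smith@acme.com",
     "company": "Acme Inc.", "job_title": "CMO", "company_size": "50"},
    {"first_name": "D.", "last_name": "Smith", "email": "d.smith@gmail.com",
     "company": "Acme Inc.", "job_title": "VP of Marketing", "company_size": "100"},
    {"first_name": "Dee", "last_name": "S.", "email": "dawn.smith@acme.com",
     "company": "Acme", "job_title": "Chief Marketing Officer",
     "company_size": "N/A"},
]


def conflicting_fields(group):
    """Map each field to its distinct raw values, keeping only fields that disagree."""
    conflicts = {}
    for field in group[0]:
        values = sorted({record[field] for record in group})
        if len(values) > 1:
            conflicts[field] = values
    return conflicts


for field, values in conflicting_fields(records).items():
    print(f"{field}: {len(values)} distinct values -> {values}")
```

Every field in this group disagrees in some way, which is a strong hint that a single blanket rule will not pick a fully accurate master.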
Most companies default to choosing a ‘master record’ with the earliest creation date. That will be the right choice for many of your duplicates but will cause issues with a certain percentage every time.
Choosing the right master record here is important but tricky. You want to make sure that you have the most accurate customer data. There are several possibilities here. You know that, at minimum, one record contains inaccurate data.
But simply defaulting to the first created record might be an issue. Based on these three records, it seems plausible that Dawn started out as a CMO at Acme and was later promoted to VP of Marketing as the company grew. There is a chance that the middle record may be the most accurate record while also being the most recently created. But that doesn’t mean the other records don’t have more accurate or updated data in some specific fields.
If you were to merge these records using the earliest creation date, you risk losing an accurate profile for Dawn.
This example illustrates how complicated deduplication can be and why choosing the right master record is so important for retaining accurate, quality data. It can get more involved, as you’ll often see more than three duplicates for a single contact or company.
Salesforce and HubSpot are two of the most popular, feature-rich CRM systems available today. Because HubSpot was originally a marketing-focused platform and Salesforce is sales-focused, the two have become a natural pairing for many companies. Those that use both readily recognize the benefits of having reliable data syncing between the two.
Some small but aggravating data problems can come from syncing Salesforce and HubSpot. Bad data can break the sync altogether. If the sync is broken for an extended period, cleaning and reconciling that data can be a huge pain. Additionally, parent-child hierarchies between the two systems can be complicated and error-prone.
There are also sometimes issues with duplicates that arise once the Salesforce to HubSpot sync is in place. This is a huge problem, particularly for account-based marketing teams that rely on accurate contact-to-company associations.
In addition, duplicate Salesforce leads will often sync with HubSpot, creating duplicates in both systems. When there are duplicate records in Salesforce that only share a name, or have a different email convention like dawn.smith@acme.com, dawns@acme.com, or dawn@acme.com, HubSpot will not identify these as duplicate records.
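Records like these can still be surfaced as candidates by matching on something other than the exact email address. The sketch below groups leads by name plus email domain. It is an illustrative heuristic only, not a description of how Salesforce or HubSpot match records, and the field names and helper functions are hypothetical.

```python
from collections import defaultdict


def match_key(record):
    """Build a loose matching key: lowercased name plus email domain."""
    domain = record["email"].rsplit("@", 1)[-1].lower()
    return (
        record["first_name"].strip().lower(),
        record["last_name"].strip().lower(),
        domain,
    )


def candidate_duplicates(records):
    """Group records by match key and return only groups with more than one record."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


leads = [
    {"first_name": "Dawn", "last_name": "Smith", "email": "dawn.smith@acme.com"},
    {"first_name": "Dawn", "last_name": "Smith", "email": "dawns@acme.com"},
    {"first_name": "dawn", "last_name": "smith", "email": "dawn@acme.com"},
    {"first_name": "Dawn", "last_name": "Smith", "email": "dawn@example.org"},
]

for group in candidate_duplicates(leads):
    print([lead["email"] for lead in group])
```

Two different people with the same name at the same company would also land in one group, so results like these are best treated as candidates for review rather than as confirmed duplicates.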
Duplicates are a complicated enough issue on their own, but when taking into account different cross-platform integrations and the complexities that come with syncing that data, the issue can become an even bigger headache.
To deduplicate records when the Salesforce to HubSpot sync is active, choose a master record that has either the Salesforce Account ID or the Salesforce Contact ID populated.
Now let’s look at some of the most common ways that companies choose a master record when merging duplicates.
There are some common ways to pick a master record. Of course, the right way to choose a master record depends entirely on your specific situation. There may be something about the way that your company collects and utilizes customer data that makes one a better choice over another. There also may be different choices that would be considered ‘right’ for individual sets of matching duplicates.
Some of the most common practices for picking a master record when merging duplicates include:
The right way to choose a master record depends on how your company collects, stores, and utilizes your customer data. But it is critical that you make the right choice, as there are some serious downsides to choosing rules for master records at random when bulk merging duplicates.
Customizing merge behavior can help you to limit mistakes, make better merging decisions, and take the guesswork out of deduplication processes.
Ideally, you’d have a multi-step process for determining master records. If the first condition is met, the master record is chosen. If it is not, then you continue down your list of master selection rules.
For instance, the master record selection rules for merging duplicate Salesforce companies might look something like this:
A multi-step process allows you to ensure that you have your bases covered and simplifies the master record selection process for your team.
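A multi-step process of this kind maps naturally onto an ordered list of rules. The sketch below is one possible implementation, not Insycle's. The rule order (Salesforce Account ID, then number of associated deals, then company owner, then earliest creation date) is modeled on the rules this article describes, while the field names, the sample data, and the choice to carry only the tied records forward to the next rule are assumptions.

```python
from datetime import date


def has_salesforce_id(group):
    """Records with a Salesforce Account ID populated."""
    return [r for r in group if r.get("salesforce_account_id")]


def most_deals(group):
    """Records tied for the highest number of associated deals."""
    top = max(r.get("deal_count", 0) for r in group)
    return [r for r in group if r.get("deal_count", 0) == top]


def has_owner(group):
    """Records with a company owner assigned."""
    return [r for r in group if r.get("owner")]


def earliest_created(group):
    """Records sharing the earliest creation date (the catch-all rule)."""
    first = min(r["created"] for r in group)
    return [r for r in group if r["created"] == first]


RULES = [has_salesforce_id, most_deals, has_owner, earliest_created]


def select_master(group):
    """Apply each rule in order until exactly one candidate remains."""
    candidates = list(group)
    for rule in RULES:
        matched = rule(candidates)
        if len(matched) == 1:
            return matched[0]
        if matched:
            # Several records satisfy the rule: only they move to the next step.
            candidates = matched
    # Still tied after every rule: fall back to the first remaining record.
    return candidates[0]


# Hypothetical duplicate group of companies.
companies = [
    {"id": 1, "salesforce_account_id": None, "deal_count": 2,
     "owner": "ana", "created": date(2019, 3, 1)},
    {"id": 2, "salesforce_account_id": None, "deal_count": 5,
     "owner": None, "created": date(2020, 6, 15)},
    {"id": 3, "salesforce_account_id": None, "deal_count": 5,
     "owner": "raj", "created": date(2021, 1, 10)},
]

master = select_master(companies)
print(master["id"])
```

Here no record has a Salesforce Account ID, two records tie on deal count, and the owner rule breaks the tie. Keeping the rules in a plain list makes the order easy to review and change as your data processes evolve.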
Like any data cleansing process, deduplication and master selection have some recommended best practices that you should follow to give yourself the highest chance of success.
First, companies should have processes in place for generating previews of their deduplication results. This is especially true when you are first getting started and gaining an understanding of how your particular deduplication tools and processes work. Once you’re comfortable, you should look to cut back on manual review and institute deduplication automation so that you can free yourself to focus on other data management tasks.
Reporting should also play a critical role in deduplication. First, it is a way to share, collaborate, and gather feedback from your team. A report can help you to identify ways to improve your master selection and other deduping processes. Reports also have the added benefit of serving as a backup and audit trail in case anything goes wrong.
Before making any live changes to your CRM data, it’s always a good idea to preview the results to see what changes your master selection rules would make.
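A preview can be as simple as a dry run that records which record would become the master and which would be merged into it, without touching live data. The sketch below builds such a plan and serializes it to CSV so it can double as a report. The function names, field names, and the placeholder `lowest_id` rule are hypothetical.

```python
import csv
import io

FIELDNAMES = ["master_id", "merged_id", "master_email", "merged_email"]


def build_preview(groups, select_master):
    """Return one row per record that would be merged away; nothing is modified."""
    rows = []
    for group in groups:
        master = select_master(group)
        for record in group:
            if record is not master:
                rows.append({
                    "master_id": master["id"],
                    "merged_id": record["id"],
                    "master_email": master["email"],
                    "merged_email": record["email"],
                })
    return rows


def preview_to_csv(rows):
    """Serialize preview rows to CSV text for review or as an audit trail."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def lowest_id(group):
    """Placeholder master rule for this sketch: the lowest record id wins."""
    return min(group, key=lambda record: record["id"])


# Hypothetical duplicate group.
groups = [[
    {"id": 101, "email": "d.smith@acme.com"},
    {"id": 205, "email": "d.smith@gmail.com"},
]]

rows = build_preview(groups, lowest_id)
print(preview_to_csv(rows))
```

Reviewing a file like this with your team before merging makes it easier to spot a rule that picks the wrong master, and the same file can be kept afterwards as a record of what was merged.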
Insycle is the ultimate deduplication software, making it easy to identify duplicate records and merge them in bulk using smart master record selection rules.
Once you determine the right way to choose a master record in your data set, Insycle makes it easy to create a straightforward, multi-step process for master record selection.
Here’s an example:
Here, just as in our earlier example, we are checking a list of duplicates to see if a Salesforce ID exists. If a record does have one, all other duplicate records will be merged into it as the master.
If none of the records have a Salesforce Account ID, or multiple do, then we move down to the next step — the record with the highest number of associated deals. Then company owner, and finally, creation date. This ensures that you go through the list of rules that make the most sense based on your CRM data, while using the final “creation date” rule as a catch-all for records that don’t meet the other criteria.
Then, Insycle allows you to both preview the changes to your data before they go live (and see how your master selection rules end up playing out within your database), and generate a .CSV report of the changes.
Insycle makes it easy for companies to create multi-step, rule-based master selection processes. This is helpful because:
Are you tired of picking the master record by hand time after time, and manually analyzing fields of each duplicate record in order to pick the right master?
Sign up for Insycle’s 7-day trial and institute process-based deduplication into your customer data management strategy.