CRM Clean & Enrich: Diagnostic Deep Dive

This article takes a detailed look at our CRM Clean & Enrich process: what each diagnostic category means and how each can be resolved.

For a more introductory overview of this process, and why we care so much about this, you may want to begin with our DataFox CRM Clean & Enrich Overview.

Example of a Diagnostic Chart

Below is a sample client CRM diagnosis in which DataFox matched CRM records to DataFox company profiles:

Category Definitions

Before matching and enriching anything, we rigorously test each record for errors. Why do we do anomaly detection?

Anomalous Categories:

  • Irregular: Something was potentially wrong with the record. 
  • Duplicate: Two or more records appear to belong to the same company. 

No Anomaly Detected: 

  • Verified: The company record passed all tests, and no anomalies or duplicates were found. 

Anomalous Category: Irregular

Irregular: Missing URL

Description: The Account submitted did not have a URL associated with it.

Irregular: Invalid name

Description: The Account name submitted was unintelligible.

Example:

  • ##@@$$$%%%

Action: You can either (1) ignore these cases (and we will not match these accounts), or (2) amend the name and re-submit for matching.
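As a rough illustration of this kind of check, the sketch below treats a name as unintelligible when it contains no letters or digits at all. Both that rule and the function name is_unintelligible_name are assumptions made for this example, not DataFox's actual test.

```python
def is_unintelligible_name(name: str) -> bool:
    """Return True when the name has no letters or digits (assumed rule)."""
    return not any(ch.isalnum() for ch in name)


# The example name from this article is made up only of symbols.
flagged = is_unintelligible_name("##@@$$$%%%")
# An ordinary company name contains letters, so it is not flagged.
not_flagged = is_unintelligible_name("Insight Software")
```

A rule this simple will miss names that are gibberish but alphanumeric, so it should be read as a first filter only.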

Irregular: Mismatched Query Data

Description: We flag cases where the Account name points to one company and the URL points to another. Often these records are fine, but we flag them so you can decide how to handle them.

Example:

  • Agricon Global Corporation                   bayhillcapital.com
  • Arvato Hightech                                   bertelsmann.de

To avoid errors, we do not match on Name alone or URL alone, because the two may point to different companies. Some of these records are fine: the company's Name and URL simply look very different. Others reflect a real problem in the data. Because DataFox always errs on the side of avoiding incorrect matches, we do not guess whether the Name or the URL is more reliable.

Action: You can either (1) ignore these cases (and we will not match these accounts), or (2) spot check the results and choose whether DataFox should match to the Name or URL.     
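The idea of resolving the Name and the URL independently and then comparing the results can be sketched as follows. The two lookup tables and the identifiers "company-A" and "company-B" are hypothetical placeholders for this example; a real system would resolve against company profiles.

```python
from typing import Optional

# Hypothetical lookup tables standing in for a company profile database.
NAME_INDEX = {"agricon global corporation": "company-A"}
DOMAIN_INDEX = {"bayhillcapital.com": "company-B"}


def check_name_url(name: str, domain: str) -> str:
    """Resolve name and domain separately, then compare the two results."""
    by_name: Optional[str] = NAME_INDEX.get(name.strip().lower())
    by_domain: Optional[str] = DOMAIN_INDEX.get(domain.strip().lower())
    if by_name and by_domain and by_name != by_domain:
        # Both resolved, but to different companies: flag, do not guess.
        return "Irregular: Mismatched Query Data"
    return "no mismatch detected"


result = check_name_url("Agricon Global Corporation", "bayhillcapital.com")
```

Note that the sketch only flags the record. It makes no attempt to decide which field is right, which mirrors the choice described above.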

Irregular: Email

Description: We flag URLs that look like email addresses to prevent the false matching of companies like Google, Yahoo or any other major email domain provider.

Example:

  • United                                      united@gmail.com
  • Insight Software                        insightsoftware@yahoo.fr

Action: You can either (1) ignore these cases (and we will not match these accounts), or (2) amend the URL and re-submit for matching.
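One simple way to spot an email address in a URL field is a pattern check. The regular expression below is an assumption for illustration (something@domain.tld with no spaces or slashes) and is not a full email validator.

```python
import re

# Assumed pattern: local-part@domain.tld, with no whitespace or slashes.
EMAIL_LIKE = re.compile(r"^[^@\s/]+@[^@\s/]+\.[^@\s/]+$")


def looks_like_email(url_field: str) -> bool:
    """Return True when the URL field looks like an email address."""
    return EMAIL_LIKE.match(url_field.strip()) is not None


flags = [looks_like_email(v) for v in ("united@gmail.com", "united.com")]
```

Flagging these values before matching is what keeps an account such as "United" from being matched to the email provider's own domain.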

Irregular: Parked Domain

Description: Our matching algorithm visits every URL and detects whether it redirects to a parked domain (e.g., GoDaddy.com or HugeDomains.com).

Example:

  • advancedtechnologiesgroup.com >> domainmarket.com 
  • altomines.com >> hugedomains.com 

A parked domain can also indicate that a company has gone out of business or changed its domain.

Action: You can resolve parked domains by: 

  • Flagging the accounts as dubious;
  • Removing the 'dead' URLs and leaving the account records alone; or
  • Investigating the records to find the correct URLs and re-submitting to DataFox for matching.
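Once a URL's redirect chain has been followed, the parked-domain test reduces to checking where it ended up. The sketch below assumes the final URL is already known (it performs no network requests) and uses an illustrative host list taken from the examples in this article. The names PARKING_HOSTS, host_of, and is_parked are hypothetical.

```python
from urllib.parse import urlparse

# Illustrative list built from this article's examples; not exhaustive.
PARKING_HOSTS = {"domainmarket.com", "hugedomains.com"}


def host_of(url: str) -> str:
    """Extract a lowercase host, tolerating URLs written without a scheme."""
    url = url.strip()
    if "://" not in url:
        url = "//" + url  # lets urlparse treat the first segment as the host
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_parked(final_url: str) -> bool:
    """final_url is where the Account URL landed after following redirects."""
    return host_of(final_url) in PARKING_HOSTS
```

For instance, altomines.com from the example above would be passed in as its redirect target, hugedomains.com, and would be flagged.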

Irregular: Vanity URL

Description: Some organizations do not have their own URL, but rather maintain a profile on another company directory (e.g. Facebook, Wikipedia, Etsy, AngelList, or Crunchbase). We categorize these records as 'Vanity URLs'. 

Example:

  • facebook.com/bobs-hardware
  • etsy.com/marys-necklaces

Vanity URLs can indicate a company is small or has a minimal web presence.

Action: Spot check the results. In some cases, the Vanity URL can be replaced by the actual company website URL. In other cases, it's an obscure company without its own website.
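A minimal sketch of a Vanity URL check follows. It assumes a short, illustrative list of directory hosts and treats a URL as a Vanity URL only when it points at a page on one of those hosts rather than at the bare site. The list and the function name is_vanity_url are assumptions for this example.

```python
from urllib.parse import urlparse

# Illustrative directory hosts; a real list would be much longer.
DIRECTORY_HOSTS = ("facebook.com", "wikipedia.org", "etsy.com", "crunchbase.com")


def is_vanity_url(url: str) -> bool:
    """Return True for a profile page hosted on a known directory site."""
    url = url.strip().lower()
    if "://" not in url:
        url = "//" + url  # lets urlparse treat the first segment as the host
    parts = urlparse(url)
    host = parts.hostname or ""
    on_directory = any(host == d or host.endswith("." + d) for d in DIRECTORY_HOSTS)
    has_profile_path = parts.path.strip("/") != ""
    return on_directory and has_profile_path
```

Requiring a non-empty path is a deliberate choice here: a record whose URL is just facebook.com may really be Facebook itself, which is a different situation from a shop that only has a Facebook page.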

Irregular: Invalid URL

Description: Your Account URL does not direct to a company website. The URL is either corrupt, mis-formatted, missing a character, or not a website at all.

Example:  

  • http:
  • http://zazzle.cccom
  • http://actel...com
  • nourl

Action: You can either (1) skip the records or (2) find correct URLs and then re-submit to DataFox for matching. 
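The following sketch shows one way such a check might be structured: parse out the host, require at least two well-formed labels, and require a recognized top-level domain. The KNOWN_TLDS set is a tiny illustrative sample and the function name is hypothetical. A production check would use the full IANA list and would likely also visit the URL.

```python
import re
from urllib.parse import urlparse

KNOWN_TLDS = {"com", "org", "net", "de", "fr"}  # illustrative sample only
LABEL = re.compile(r"^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$")


def is_valid_company_url(raw: str) -> bool:
    """Syntactic check only: does the value look like a website address?"""
    url = raw.strip().lower()
    if "://" not in url:
        url = "//" + url  # lets urlparse treat the first segment as the host
    try:
        host = urlparse(url).hostname or ""
    except ValueError:
        return False
    labels = host.split(".")
    if len(labels) < 2:
        return False  # e.g. "http:" or "nourl"
    if not all(LABEL.match(label) for label in labels):
        return False  # e.g. empty labels in "actel...com"
    return labels[-1] in KNOWN_TLDS  # rejects e.g. "zazzle.cccom"
```

All four example values above fail this check, each for a different reason noted in the comments.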

Irregular: Shortened URL

Description: Sometimes a company record contains a shortened URL rather than the company's actual URL.

Example:

  • bit.ly/abc
  • ow.ly/xyz

Action: You can either (1) ignore these records or (2) find the correct URL and then re-submit to DataFox for matching.

Anomalous Category: Duplicate

Duplicate: Name Duplicate

Description: Two or more different account records share the same or similar name.

Name Duplicates can occur in three different cases:

Case 1: The Name Duplicates are actually duplicate records for the same company.

Example:

  • Three records all named "Walmart". 

Action: You can choose to link or merge the records in your CRM. 

Case 2: The Name Duplicates are actually different companies, but just happen to have the same name. 

Example:

  • There are multiple companies named "Insight Software".

Action: Ignore. There's nothing actually wrong. 

Case 3: The Name Duplicates have fake, dummy, or placeholder names. 

Example:

  • Multiple records are named "XYZ", "No Name", "NewCo", or "Test". 

Action: Find the correct names or flag the records as dubious.  
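Grouping records by a normalized name is one simple way to surface Name Duplicates, and a small placeholder list can separate out Case 3. The normalization used here (case-insensitive, whitespace collapsed) and the PLACEHOLDER_NAMES set are assumptions for this sketch. The article does not specify how "similar" names are compared.

```python
from collections import defaultdict

# Placeholder names taken from the Case 3 example above.
PLACEHOLDER_NAMES = {"xyz", "no name", "newco", "test"}


def normalize(name: str) -> str:
    """Case-fold the name and collapse runs of whitespace."""
    return " ".join(name.casefold().split())


def find_name_duplicates(names):
    """Group names by normalized form and keep groups with 2+ members."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}


def is_placeholder(name: str) -> bool:
    """Return True for fake, dummy, or placeholder names (Case 3)."""
    return normalize(name) in PLACEHOLDER_NAMES
```

Name grouping alone cannot tell Case 1 from Case 2. Deciding whether several "Insight Software" records are one company or several still needs the URL or a manual check.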

Duplicate: Redirected URL Duplicate

Description: Our matching algorithm visits each URL to detect whether a given website redirects to (1) a unique URL or (2) the same URL as another Account record.

Example: Two Account records with very different looking URLs redirect to the same page:

  • aamyers.com >> allanmeyers.com
  • americaninfrastructure.com >> allanmeyers.com

Action: You can (1) update the URLs to the correct company website(s), (2) merge the duplicate records, or (3) link them to each other if the Accounts are related (e.g. subsidiaries, divisions, or acquisitions).
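Given each Account's original URL and the URL it finally redirects to, Redirected URL Duplicates can be found by grouping on the redirect target. The sketch below assumes the redirect targets have already been collected into a dictionary. The function name is hypothetical.

```python
from collections import defaultdict


def find_redirect_duplicates(final_url_by_original):
    """Map each shared redirect target to the original URLs that reach it."""
    by_target = defaultdict(list)
    for original, final in final_url_by_original.items():
        by_target[final].append(original)
    # Only targets reached by two or more records are duplicates.
    return {t: sorted(urls) for t, urls in by_target.items() if len(urls) > 1}


redirects = {
    "aamyers.com": "allanmeyers.com",
    "americaninfrastructure.com": "allanmeyers.com",
    "altomines.com": "hugedomains.com",
}
duplicates = find_redirect_duplicates(redirects)
```

With the sample data, only allanmeyers.com is reported, because it is the only target shared by more than one record.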

Duplicate: Post-Match Duplicate

Description: Some account records are duplicates even if the Account names and URLs appear to be different. This can occur because the same company can have multiple names and URLs.

DataFox is able to detect these Post-Match Duplicates because DataFox company profiles store all names and domains owned by the company. For example, our system knows that hotwater.com and aosmith.com are both owned by AO Smith, and that AO Smith Water Products Company is a division of AO Smith, so two records carrying these details are Post-Match Duplicates.

Action: You can choose to link or merge the records in your CRM. 
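The sketch below illustrates why storing every name and domain on a profile makes these duplicates detectable: two records that look unrelated resolve to the same profile. The PROFILES structure, the profile ID "ao-smith", and the pairing of each name with each domain in the sample records are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical profile store: each profile lists all names and domains it owns.
PROFILES = {
    "ao-smith": {
        "names": {"ao smith", "ao smith water products company"},
        "domains": {"aosmith.com", "hotwater.com"},
    },
}


def match_profile(name: str, domain: str):
    """Return the profile ID whose names and domains both cover the record."""
    for profile_id, profile in PROFILES.items():
        if name.strip().lower() in profile["names"] and \
                domain.strip().lower() in profile["domains"]:
            return profile_id
    return None


def post_match_duplicates(records):
    """Group (name, domain) records by matched profile; keep groups of 2+."""
    by_profile = defaultdict(list)
    for name, domain in records:
        profile_id = match_profile(name, domain)
        if profile_id is not None:
            by_profile[profile_id].append((name, domain))
    return {pid: recs for pid, recs in by_profile.items() if len(recs) > 1}
```

Because the grouping key is the matched profile rather than the raw name or URL, this check can only run after matching, which is where the category gets its name.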

Duplicate: URL Duplicate

Description: Two or more Account records share the same URL domain.

Example

Action: You can choose to link or merge the records in your CRM.

Duplicate: Name and URL Duplicate

Description: Two different account records share the same Name and URL.  

Example

Action: You can merge or link the two account records together.
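The three name- and URL-based duplicate categories differ only in which fields two records share, which can be sketched as a small pairwise classifier. The record layout (name, domain) and the exact-match comparison after light normalization are assumptions for this example. The domains used in the usage lines are placeholders.

```python
def _norm(value: str) -> str:
    """Case-fold and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.casefold().split())


def classify_pair(record_a, record_b):
    """Each record is a (name, domain) tuple. Returns a category or None."""
    same_name = _norm(record_a[0]) == _norm(record_b[0])
    same_domain = _norm(record_a[1]) == _norm(record_b[1])
    if same_name and same_domain:
        return "Duplicate: Name and URL Duplicate"
    if same_domain:
        return "Duplicate: URL Duplicate"
    if same_name:
        return "Duplicate: Name Duplicate"
    return None


both = classify_pair(("Acme", "acme.example"), ("ACME", "acme.example"))
url_only = classify_pair(("Acme", "acme.example"), ("Acme Holdings", "acme.example"))
```

Checking the combined case first matters: otherwise a pair sharing both fields would be reported under the weaker URL-only or Name-only category.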

No Anomalies Detected: Verified

All Accounts marked as Verified have passed all of the above tests and are potential matches to DataFox company profiles.

Verified: Automatch

Description: Your records matched to a DataFox record with high confidence.

Action: No action necessary. Your record is ready to be enriched with DataFox info. 

Verified: Human Matching

Description: While your record passed all of our anomaly tests, it did not match to a DataFox company profile with high confidence. 

This means either (1) the Account record details point to more than one DataFox company profile or (2) DataFox needs to create a new company profile.

Because we take every precaution to only deliver high confidence matches, we have our human analysts assess and match each of these records. We can also provision new company profiles for missing records.  

Action: No immediate action necessary. We will follow up with you after our analysts have assessed and matched these records. 

Verified: Already Matched

Description: For many clients, we match and re-match Accounts multiple times over the customer lifecycle. An Account record labeled 'Already Matched' was matched during a prior matching process.

Action: No action necessary. 

 
