
Match is in our name, so of course we do Fuzzy Matching!

What is it? Fuzzy Matching describes a set of tools we use to find similar values and similar records, within and across different data sets. Most Fuzzy Matching tools let the user pick columns from their data sets and set a minimum similarity threshold for each one. For example, if you’re looking for duplicates in a single list of customer information, you might select the “Customer Name” and “Customer Address” columns and set the thresholds to 95. The Fuzzy Matching solution should then return pairs or groups of potential “Duplicate” records in which the data in the “Customer Name” and “Customer Address” columns is at least 95% similar.
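To make the column-and-threshold idea concrete, here is a minimal Python sketch. It is only an illustration, not how Match Data Pro or any particular product computes similarity: it uses the standard library's `difflib.SequenceMatcher` as a stand-in scoring method, and the function names (`similarity`, `find_potential_duplicates`) and the sample customer records are made up for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Score two strings from 0 to 100, ignoring case and outer whitespace."""
    return 100.0 * SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def find_potential_duplicates(records, columns, threshold):
    """Return index pairs whose values meet the threshold in every selected column."""
    pairs = []
    for (i, left), (j, right) in combinations(enumerate(records), 2):
        if all(similarity(left[c], right[c]) >= threshold for c in columns):
            pairs.append((i, j))
    return pairs


# Hypothetical sample data: rows 0 and 1 are the same customer entered two ways.
customers = [
    {"Customer Name": "Jonathan Smith", "Customer Address": "123 Main Street"},
    {"Customer Name": "Johnathan Smith", "Customer Address": "123 Main Street."},
    {"Customer Name": "Maria Garcia", "Customer Address": "45 Oak Avenue"},
    {"Customer Name": "Jonathan Smith", "Customer Address": "987 Elm Road"},
]

duplicates = find_potential_duplicates(
    customers, ["Customer Name", "Customer Address"], threshold=95
)
print(duplicates)  # [(0, 1)]
```

Rows 0 and 3 share an identical name, but their addresses are nowhere near 95% similar, so they are not reported as a pair. Comparing every record with every other record is fine for a small list like this one; larger data sets usually need a way to narrow down the candidate pairs first.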

Why do we use Fuzzy Matching? Because data is rarely entered exactly the same way twice. Changes made to the information during processing make the same data even harder to match, and data moves and transforms from system to system and from business process to business process. There are hundreds or thousands of other reasons the data doesn’t match, including missing information, misfielded information, new information, formatting differences, abbreviations, nicknames, and a lot more.

Why doesn’t the data match? What starts as one of many initial contacts (and perhaps companies, addresses, phone numbers, etc.) in the marketing system converts to one of many other contacts, with much more information, in the CRM system. That information changes many times over the years, and it later converts to product and billing information in other systems, which requires still more changes to the “same” data. The data downstream rarely matches the data from previous steps in the customer journey.

Time has the same effect. Even data that isn’t converting, or hasn’t been used for a while, still needs to be updated or changed fairly regularly to stay useful, which means newer versions of the “same” data will rarely match previous versions. When data is cleansed and standardized in one system outside the context of master data management, it becomes less likely to match data in other systems. When that data migrates to new systems, or when you deploy new systems to support new requirements, it is often cleansed or prepared for those systems, which again makes it less likely to match earlier versions of the “same” data elsewhere.

Finally, as customers, vendors, suppliers, and business partners request, require, or experience changes in their orders with you, their staff, their contact information, their billing information, or the parts or products you’re purchasing from them, the data in your systems will probably need to be updated as well. Each time the information is updated, it’s less likely to match earlier versions of the same data.

While many systems allow the user to pick columns from their data sets and allow the user to set specific thresholds of minimum similarity, the Match Data Pro Fuzzy Matching solution gives you a lot more flexibility and advanced configurability to get the very best results.