
We have Senzing Entity Resolution under the hood!

Senzing Entity Resolution

Senzing Story: IBM acquired Las Vegas-based Systems Research & Development (SRD) in 2005 for its real-time entity resolution technology known as Non-Obvious Relationship Awareness (NORA). IBM renamed and now sells this technology under the brand name IBM InfoSphere Identity Insight. It’s a unique product in use around the world. Much was learned.

In 2009, four years later, Jeff Jonas and team developed a vision to revolutionize entity resolution. The design aspirations included such things as: domain agnostic; self-tuning, self-correcting thus not dependent upon entity resolution experts; optimized for horizontally-scaling cloud compute infrastructures; Privacy by Design (PbD); relationship awareness; geospatial awareness; real-time; extensible; scalable; and easily deployed and operated.

In mid-2009, Jeff Jonas and team embarked on this ambitious journey, working on the design (blueprints) while at IBM in a skunkworks project code-named “G2.” After one year of design, the team started coding the core engine (in C++). On Data Privacy Day 2012, two and a half years after inception, the G2 technology became commercially available. Thousands of copies were shipped. One such use is described in a 2018 New York Times story entitled “Another use for A.I.: Finding Millions of Unregistered Voters.”

About Senzing® entity resolution: Senzing® entity resolution (ER) discovers common entities and relationships within data to provide a complete inventory of every record related to each person and company.

The Senzing® entity resolution technology resolves entities using a unique principle-based approach that delivers higher quality entity resolution results than other methods. Principles also make Senzing® software easier to deploy and virtually eliminate the need for pre-training, tuning or experts.

The Senzing principles are designed for use on a wide range of entity types, including people, organizations, vessels and vehicles. They use a special form of generalized knowledge that draws on common truths or assumptions. These principles are distinctly different from the rules used by many other entity resolution methods. Here’s an example:

You tell your child to quit throwing rocks at cars, which is a rule. The next day you find him throwing baseballs at SUVs and have to tell him not to do that too, another rule. A few days later, you have to tell him not to throw golf balls at trucks, fire engines and ambulances, more rules. Instead of all these rules, why not one simple principle: Don’t throw things at other people’s stuff?

Senzing software uses principles to determine when entities are the same, possibly the same or possibly related. Our principles are based on the expected behaviors of entity attributes, e.g., names, addresses and identifiers. For example, social security numbers (SSNs) typically point to only one person, but dates of birth (DOBs) are shared by many people.
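To make the idea of expected attribute behaviors concrete, here is a minimal Python sketch. It is an illustration only, not Senzing's actual algorithm or API: the `Behavior` class, the `BEHAVIORS` table, the weights and the thresholds in `compare` are all hypothetical. It assumes that an "exclusive" attribute such as an SSN normally belongs to one entity, while names, DOBs and addresses may be shared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    exclusive: bool  # True if a value is expected to belong to one entity only
    weight: int      # hypothetical evidence weight when two records agree


# Hypothetical behaviors for illustration; not Senzing's real configuration.
BEHAVIORS = {
    "ssn": Behavior(exclusive=True, weight=3),
    "name": Behavior(exclusive=False, weight=1),
    "dob": Behavior(exclusive=False, weight=1),
    "address": Behavior(exclusive=False, weight=1),
}


def compare(a: dict, b: dict) -> str:
    """Classify two records using only the expected behavior of each attribute."""
    shared = [k for k in BEHAVIORS if k in a and k in b]
    agree = [k for k in shared if a[k] == b[k]]
    # A disagreement on an exclusive attribute suggests two different entities.
    conflict = [k for k in shared if a[k] != b[k] and BEHAVIORS[k].exclusive]
    score = sum(BEHAVIORS[k].weight for k in agree)

    if conflict:
        return "possibly related" if agree else "no match"
    if score >= 3:
        return "same"
    if score == 2:
        return "possibly same"
    if score == 1 and "address" in agree:
        return "possibly related"
    return "no match"
```

In this sketch a matching SSN is enough to call two records the same, a shared DOB alone proves nothing, and two records with different SSNs but the same address come out as possibly related.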

Expected behaviors always have exceptions and Senzing software learns about them in real time as new data is received. For example, when multiple people are using the same SSN, our software detects it, labels the SSN as generic and reevaluates all prior records with that number.
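The exception handling described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Senzing software is implemented: the `GenericDetector` class, its `max_people` threshold and the `person_key` argument (a stand-in for whatever identity the other attributes establish) are hypothetical names introduced for this example.

```python
from collections import defaultdict


class GenericDetector:
    """Flags an SSN as generic once more distinct people use it than expected."""

    def __init__(self, max_people: int = 1):
        self.max_people = max_people               # hypothetical threshold
        self.records_by_ssn = defaultdict(list)    # ssn -> [(record_id, person_key)]
        self.generic = set()                       # SSNs no longer trusted as identifiers

    def add(self, record_id: str, ssn: str, person_key: str) -> list:
        """Ingest one record; return ids of prior records that need re-evaluation."""
        prior = [rid for rid, _ in self.records_by_ssn[ssn]]
        self.records_by_ssn[ssn].append((record_id, person_key))
        if ssn in self.generic:
            return []  # already flagged; nothing new to revisit
        people = {key for _, key in self.records_by_ssn[ssn]}
        if len(people) > self.max_people:
            # The expected behavior was violated: label the SSN generic
            # and hand back every earlier record that relied on it.
            self.generic.add(ssn)
            return prior
        return []
```

For example, after two records for the same person arrive with one SSN, a third record for a different person with that SSN causes `add` to return the first two record ids so that earlier decisions based on the SSN can be revisited.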

Senzing software comes preconfigured with the common attributes and expected behaviors of people and organizations. You can immediately start loading and resolving entities without any configuration, training or tuning.