The role of entity resolution and best practices in SGR databases
At SGR Compliance, data integrity and reliability are the cornerstones of our mission to provide clients with the most accurate and comprehensive information possible. One of our key challenges is consolidating data, both structured and unstructured, from various sources so that duplicates can be identified and eliminated effectively.
Entities—both individuals and organizations—can appear across multiple sources, often with variations in names, addresses, or identifiers. Without a robust deduplication process, clients could receive incomplete or confusing reports.
To address this, SGR applies Entity Resolution within its database: information from diverse sources is correlated and merged into a unique, comprehensive profile for each entity. This approach drastically reduces duplication and improves data quality, because each profile draws on a broader range of data points, which makes it easier to identify subjects accurately.
For example, more than 80% of Politically Exposed Persons (PEPs) in the database developed by SGR include at least one of the following key data points:
- year of birth;
- an identifying photograph;
- a personal ID code or professional background information.
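To make the idea of Entity Resolution more concrete, here is a minimal Python sketch of how records from different sources might be merged into a single profile. It is an illustration only and does not describe SGR's actual implementation: the record fields (`name`, `year_of_birth`, `source`), the function names, and the matching rule (same normalized name plus a compatible year of birth) are all assumptions chosen for the example.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Strip accents and punctuation, lowercase, and sort the tokens,
    so that 'García, José' and 'Jose Garcia' yield the same key."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.casefold())
    return " ".join(sorted(cleaned.split()))


def resolve(records):
    """Merge records that share a normalized name and a compatible year of birth.

    Hypothetical matching rule: two years are compatible when they are equal
    or when at least one of them is missing. A record with a missing year is
    attached to the first profile with the same name, which is a simplification.
    """
    profiles = []
    by_name = defaultdict(list)  # normalized name -> profiles with that name
    for rec in records:
        key = normalize_name(rec["name"])
        year = rec.get("year_of_birth")
        for profile in by_name[key]:
            known = profile["year_of_birth"]
            if year is None or known is None or known == year:
                profile["sources"].append(rec["source"])
                if known is None:
                    profile["year_of_birth"] = year
                break
        else:
            # No compatible profile found: start a new one.
            profile = {"name_key": key, "year_of_birth": year, "sources": [rec["source"]]}
            by_name[key].append(profile)
            profiles.append(profile)
    return profiles
```

In this sketch, a record for "José García" born in 1961 and a record for "Garcia, Jose" with no year of birth would end up in one profile, while a "Jose Garcia" born in 1985 would be kept as a separate entity. This is why additional data points such as the year of birth matter: they let same-named subjects be told apart. A production system would typically add fuzzy name matching, further identifiers, and manual review.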