Re-identification attacks happen – but here’s what you can do about it
The biggest risk when sharing sensitive data for research and analytics lies in the re-identification of individuals in the data. Re-identification leads to a breach, which can damage your organization’s reputation and finances. The latest numbers show that in the event of a data breach, the average cost to the organization is $208 per re-identified individual. These costs include notification, legal fines, regulatory fines, and more. De-identification enables organizations to mitigate this risk, but it needs to be done well for the organization to be truly defensible.
Re-identification occurs when a record is correctly tied to the person behind that data, even if the data was thought to have been made anonymous. Re-identification attacks happen for various reasons: sometimes for monetary gain, sometimes for personal gain, and sometimes out of curiosity. These attacks succeed when an attacker has the skills, resources, and motivation to find people hidden in the de-identified data.
There are three types of attacks:
- The first type of attack aims to re-identify a specific person and relies on preexisting knowledge about someone known to be in the de-identified database. We term the risk of this attack Prosecutor Risk.
- The second type of attack also aims to re-identify an individual, but instead relies on access to another source of public information about one or more individuals who are also present in the de-identified dataset. We term this Journalist Risk.
- The last type of attack involves re-identifying as many people as possible from the de-identified data, even if this means some of them will be incorrectly identified. The risk of this attack is known as Marketer Risk.
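To make the three risks more concrete, here is a minimal Python sketch of one common way they are quantified. It is an illustration under stated assumptions, not necessarily the method used in the guide mentioned in this post. It assumes records are grouped into equivalence classes (records sharing the same quasi-identifier values), that Prosecutor and Journalist Risk are taken as the worst case over those classes, and that Marketer Risk is the expected fraction of correct matches when the attacker's source covers exactly the released records. The function names, the `age` and `zip` fields, and the sample data are all hypothetical.

```python
from collections import Counter


def class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def prosecutor_risk(released_sizes):
    # Attacker knows the target is in the released data, so the worst case
    # is the smallest equivalence class in that data.
    return 1 / min(released_sizes.values())


def journalist_risk(released_sizes, external_sizes):
    # Attacker matches against an external source, so class sizes are
    # measured in that source, for the classes that appear in the release.
    return 1 / min(external_sizes[key] for key in released_sizes)


def marketer_risk(released_sizes):
    # Expected fraction of records matched correctly when every record is
    # attacked: number of classes divided by number of records (assumes the
    # attacker's source covers exactly the released records).
    return len(released_sizes) / sum(released_sizes.values())


# Hypothetical data: 6 released records, drawn from a 15-person public register.
released = [{"age": "30-39", "zip": "K1A"}] * 2 + [{"age": "40-49", "zip": "K1B"}] * 4
register = [{"age": "30-39", "zip": "K1A"}] * 5 + [{"age": "40-49", "zip": "K1B"}] * 10

released_sizes = class_sizes(released, ["age", "zip"])
register_sizes = class_sizes(register, ["age", "zip"])

print(prosecutor_risk(released_sizes))                  # 0.5
print(journalist_risk(released_sizes, register_sizes))  # 0.2
print(round(marketer_risk(released_sizes), 3))          # 0.333
```

In this toy example the smallest released class has two records, so a prosecutor-style attacker has at best a one-in-two chance of picking the right record, while an attacker working from the larger register has a one-in-five chance.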
All three affect the overall risk of re-identification for a dataset. Make sure to read "De-Identification 301: Three Adversaries Who Could Attack Your Data" to learn more about how these three adversaries can attack your data. Get your copy to learn what motivates attacks and how to avoid these risks before they prove costly.