Does Data Masking Work?
Safe Harbor data masking looks like de-identification in disguise: don’t be fooled!
We have often heard that data masking does not work. Many data masking vendors offer software and services that apply Safe Harbor techniques to diverse datasets. In theory, Safe Harbor's approach ensures HIPAA compliance and properly de-identifies data for secondary use. In practice, Safe Harbor has a number of drawbacks, and these contribute to the impression that data masking is limited. So in the end, organizations are left to wonder: does data masking work?
Data masking refers to applying a set of transformative techniques to a dataset that remove the direct identifiers (variables that immediately identify individuals) and generalize elements like zip codes and dates. While masking is effective at hiding individuals in the data, the masked data offers very little information for secondary purposes. In short, data masking on its own is a pretty futile exercise.
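To make the two masking steps concrete, here is a minimal Python sketch. It assumes a record stored as a dictionary; the field names (`name`, `ssn`, `email`, `zip`, `birth_date`) and the sample values are hypothetical and chosen only for illustration. It drops the direct identifiers, truncates the zip code to its first three digits, and reduces the date to a year. It is a simplification and does not implement the full Safe Harbor rule set.

```python
# Hypothetical field names; they are not taken from any particular product.
DIRECT_IDENTIFIERS = {"name", "ssn", "email"}

def mask_record(record):
    """Drop direct identifiers and generalize the zip code and date fields."""
    # Step 1: remove variables that immediately identify an individual.
    masked = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Step 2a: generalize the zip code by keeping only its first three digits.
    if "zip" in masked:
        masked["zip"] = masked["zip"][:3] + "**"
    # Step 2b: generalize an ISO date (YYYY-MM-DD) to the year alone.
    if "birth_date" in masked:
        masked["birth_year"] = masked.pop("birth_date")[:4]
    return masked

# Made-up example record.
record = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "zip": "99801",
    "birth_date": "1950-06-15",
    "diagnosis": "asthma",
}
masked = mask_record(record)
print(masked)
```

Note how little is left after masking: a coarse region, a year, and a diagnosis. The detail an analyst would want (exact dates, precise location) is gone.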
Why is it so futile? There are a number of drawbacks to using data masking alone. As previously mentioned, data masking addresses direct identifiers. This is quite logical: after all, gleaning direct information about the individuals in the dataset is NOT what organizations want. Instead, they perform research and analytics on the indirect identifiers, information that is not immediately identifying but can potentially identify individuals when combined. Indirect identifiers can be tricky because outliers pose risk. A hypothetical 98-year-old Rwandan man in Juneau, Alaska would represent an outlier in the dataset. Even without his name, he is at great risk of being re-identified from the combination of his age, ethnicity and location.
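The outlier problem can be illustrated with a short Python sketch. It counts how many records share each combination of indirect identifiers and flags any record whose combination appears fewer than `k` times. The field names, the threshold `k`, and the sample records are all assumptions made up for this example; this is a simple uniqueness check, not a full risk measurement.

```python
from collections import Counter

# Hypothetical indirect identifiers, chosen for illustration only.
INDIRECT_IDENTIFIERS = ("age_band", "ethnicity", "city")

def find_outliers(records, keys=INDIRECT_IDENTIFIERS, k=2):
    """Return records whose combination of indirect identifiers
    is shared by fewer than k records in the dataset."""
    def combination(record):
        return tuple(record[key] for key in keys)

    counts = Counter(combination(r) for r in records)
    return [r for r in records if counts[combination(r)] < k]

# Made-up records with direct identifiers already removed.
records = [
    {"age_band": "30-39", "ethnicity": "Filipino", "city": "Juneau"},
    {"age_band": "30-39", "ethnicity": "Filipino", "city": "Juneau"},
    {"age_band": "40-49", "ethnicity": "Tlingit", "city": "Juneau"},
    {"age_band": "40-49", "ethnicity": "Tlingit", "city": "Juneau"},
    {"age_band": "90+", "ethnicity": "Rwandan", "city": "Juneau"},
]
outliers = find_outliers(records)
print(outliers)
```

Even though no name appears anywhere in these records, the last one is the only record with its combination of values, which is exactly what makes the person behind it easy to single out.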
There are many other drawbacks to using data masking only – read them here, in our white paper: The Top 5 Drawbacks to Using Only Data Masking.