As an analytics leader or Chief Data Officer, you may be focused on leveraging your organization’s sensitive data assets to drive business impact — you’re likely looking at the potential of personal data.
As a Chief Privacy Officer or Data Protection Officer, your primary focus is likely the protection of personal data.
Whether you’re a CDO or CPO, in the context of U.S. healthcare, you could be up to your neck in HIPAA. And if you operate in the EU as a CDO, CPO, or DPO, or hold data of European residents, you’re in a similar situation with GDPR.
You’re likely consumed by three big questions surrounding data value:
To help you answer these three questions, review the comparison chart below. It sets the approaches side by side so you can match the best anonymization approach to your specific business needs.
If you’re a CDO, CPO, or DPO, you’re in a tricky position. The CDO advocates the effective use of data, while the CPO or DPO is mandated to support privacy. You need to work together. You need a strategy to unlock more value from personal data and, at the same time, remain judicious about protecting data privacy.
To use a sports analogy, as coaches of your team, you need to be as focused on defense as you are on offense.
With all the compliance requirements, do you sometimes feel like you’re grinding it out on defense, and making little progress on your value-generating, business-transforming goals? Or perhaps, alternatively, you’re finally generating value—making tangible gains—but are kept awake at night by the fear you’re not fully compliant?
How about a glimpse into the playbook showing real business problems that some of the world’s top companies are solving with Privacy Analytics? Organizations with a leading record for maximizing data value, such as these:
These organizations are using effective anonymization techniques not only to commercialize sensitive data, but also to employ data successfully inside the organization, reducing operational processes and taking full responsibility for sensitive data.
The common denominator for these teams is that they all leverage the added value beyond anonymization, with a repeatable process that minimizes subjectivity and can be operationalized successfully.
The bottom line: a win-win for data privacy and business value.
Many of the new, innovative and value-driving uses of data are enabled by anonymization because consent or authorization is impractical—sometimes even impossible—under privacy laws. Therefore, you need to be committed to anonymization to drive value.
However, if anonymization isn’t done well, several things can happen:
That’s why you need an effective anonymization strategy: one that has operationalized a contextual evaluation of risk.
Let’s look at the two prevailing approaches to data privacy: risk-based and rules-based de-identification.
When it comes to personal data, including personally identifiable information (PII) and protected health information (PHI), producing a high-value dataset that meets specific needs means addressing privacy concerns. Doing so requires organizations to de-identify personal information using a risk-based approach that goes beyond simple masking techniques.
For a better understanding of why successful organizations choose a risk-based approach, consider the graphics later in this article. These graphics illustrate the major differences between the main approaches in a HIPAA context, which applies to U.S. companies working with PHI.
It’s worth noting that with the advent of GDPR, the bar has been raised. For companies that operate beyond the U.S., HIPAA compliance is no longer ‘good enough’. And as new legislation is proposed and enacted, what was done yesterday to protect personal data may not be good enough tomorrow.
If you are in the healthcare sector, you already know HIPAA Safe Harbor, a rules-based approach that outlines 18 different identifiers that must be removed from data that will be used for secondary purposes.
These examples are:
HIPAA Expert Determination, on the other hand, is a risk-based approach that applies statistical or scientific principles to ensure there is only a very small, quantifiable risk that an anticipated recipient of the data could identify an individual. This approach also requires that the methods and results of the analysis be documented in a defensible compliance report.
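To make "quantifiable risk" more concrete, here is a minimal sketch of one common way to measure it: group records that share the same indirect identifiers and treat each record's risk as one divided by the size of its group. This is an illustration under stated assumptions, not the method prescribed by HIPAA or the one used in any particular Expert Determination. The function name, field names, and records are all hypothetical, and a real assessment would also account for the context in which the data is released.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Return (max_risk, avg_risk) for a list of record dicts.

    Each record's risk is 1 / (number of records sharing the same
    values for the chosen quasi-identifiers). This is a simplified,
    illustrative metric, not a complete risk assessment.
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    risks = [1.0 / class_sizes[k] for k in keys]
    return max(risks), sum(risks) / len(risks)


# Hypothetical records, created for demonstration only.
records = [
    {"age": 34, "zip3": "100", "sex": "F"},
    {"age": 34, "zip3": "100", "sex": "F"},
    {"age": 51, "zip3": "100", "sex": "M"},
    {"age": 51, "zip3": "100", "sex": "M"},
    {"age": 78, "zip3": "112", "sex": "F"},
]

max_risk, avg_risk = reidentification_risk(records, ["age", "zip3", "sex"])
print(f"max risk: {max_risk:.2f}, average risk: {avg_risk:.2f}")
```

In this toy dataset, the single 78-year-old is unique on the chosen fields, so the maximum risk is 1.0 even though the average is lower. A risk-based approach would target that record for further transformation rather than treating every record the same way.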
Here, we’re comparing HIPAA Safe Harbor (a rules-based example that is easy to visualize) with HIPAA Expert Determination, which is a risk-based example. To further simplify the illustration, we are only presenting a subset of data elements, and a small sample of records.
Please note that this example has been created for demonstration purposes only, and is not intended to reference actual persons.
Raw personal data can’t be used for most secondary purposes. Although data utility would be high, the risk level would also be very high. And, as such, it would not be compliant under the HIPAA Privacy Rule, nor anonymized under the GDPR.
Here, we often see a significant negative impact on data utility. Although the resulting data is HIPAA-compliant, risk management is moderate (and can be low in some contexts).
Furthermore, this method leaves untouched every field beyond the 18 identifiers that must be removed, so those other fields can remain high risk. HIPAA Safe Harbor should not be used outside the HIPAA jurisdiction of PHI.
Additionally, note that the data would not be considered fully anonymized under other regulations such as GDPR. Why? As an expert analysis comparing HIPAA Safe Harbor and Expert Determination explains, indirect identifiers that Safe Harbor leaves in place can still be used to identify individuals.
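The limitation of a rules-based method can be sketched in a few lines. The example below applies a small subset of Safe Harbor-style rules to one hypothetical record: it drops the name and email, keeps only the year of a date, truncates the ZIP code to three digits, and groups ages of 90 and over. The function, the field names, and the `low_population_zip3` set are assumptions made for illustration. This is not a complete or compliant Safe Harbor implementation, which covers all 18 identifier categories.

```python
def safe_harbor_sketch(record, low_population_zip3=frozenset()):
    """Apply a few Safe Harbor-style rules to one record (illustrative only).

    `low_population_zip3` is a hypothetical set of three-digit ZIP
    prefixes covering 20,000 or fewer people; those are replaced by "000".
    """
    out = dict(record)

    # Direct identifiers such as names and email addresses are removed.
    out.pop("name", None)
    out.pop("email", None)

    # Dates are reduced to the year (assumes "YYYY-MM-DD" strings).
    out["admission_date"] = record["admission_date"][:4]

    # ZIP codes are truncated to their first three digits.
    zip3 = record["zip"][:3]
    out["zip"] = "000" if zip3 in low_population_zip3 else zip3

    # Ages of 90 and over are aggregated into a single category.
    out["age"] = "90+" if record["age"] >= 90 else record["age"]

    return out


# Hypothetical record, created for demonstration only.
record = {
    "name": "Jane Example",
    "email": "jane@example.com",
    "admission_date": "2019-03-14",
    "zip": "10027",
    "age": 93,
    "diagnosis": "rare condition X",
    "occupation": "mayor",
}

result = safe_harbor_sketch(record)
print(result)
```

Notice that `diagnosis` and `occupation` pass through unchanged because no rule names them. A rare diagnosis combined with an unusual occupation may still single out one person, which is the kind of indirect identification a fixed list of rules does not address.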
With this approach, data utility remains high while the risk level is low. And, of critical importance, the approach is both HIPAA-compliant and anonymized under GDPR (and most other privacy regulations globally) when the appropriate assessments have taken place and been properly documented in a defensible compliance report.
This is because a risk-based approach measures two critical aspects:
Higher-risk contexts require greater data transformations (data disruption/modification to reduce identifiability), whereas lower-risk contexts require less. The risk-based calibration preserves utility while ensuring privacy.
The opportunity also exists to improve data utility through improved technical and organizational controls, which provide a lower-risk context.
Actual transformations depend on the level of completeness of the dataset as well as context.
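The idea that the context sets how much transformation is needed can be sketched as a simple calibration loop. The example below generalizes ages into progressively wider bands and stops at the narrowest band that meets a given risk threshold. The function names, the candidate band widths, the threshold values, and the ages are all assumptions chosen for illustration. Real thresholds come from an assessment of the release context, and real datasets involve many fields rather than one.

```python
from collections import Counter


def max_risk(ages, band_width):
    """Maximum per-record risk (1 / smallest group size) after banding ages."""
    bands = [age // band_width for age in ages]
    group_sizes = Counter(bands)
    return 1.0 / min(group_sizes.values())


def calibrate_band_width(ages, threshold, widths=(1, 5, 10, 20)):
    """Return the narrowest band width whose max risk meets the threshold.

    Returns None if no candidate width is sufficient. Narrower bands
    preserve more utility, so candidates are tried from narrow to wide.
    """
    for width in widths:
        if max_risk(ages, width) <= threshold:
            return width
    return None


# Hypothetical ages, created for demonstration only.
ages = [31, 32, 33, 34, 36, 37, 38, 39, 41, 44, 45, 47]

# A higher-risk context tolerates less risk, so it needs wider bands.
higher_risk_context = calibrate_band_width(ages, threshold=0.25)

# A lower-risk context (e.g. stronger controls) tolerates more risk.
lower_risk_context = calibrate_band_width(ages, threshold=0.5)

print(higher_risk_context, lower_risk_context)
```

With this sample, the lower-risk context can keep five-year age bands while the higher-risk context needs ten-year bands. That is the trade-off described above: stronger controls around the data can translate into less modification and more analytic detail.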
These illustrations are intended to help show why a risk-based approach makes the most sense for organizations seeking to unlock maximum value from data while providing full compliance with privacy laws.
In some circles, there’s a perception that risk-based methods are difficult to implement. Although the underlying statistical methods can be complex, and manual implementation can be onerous and unrealistic in many cases, practical solutions do exist today. There are open methodologies, training programs (e.g., HITRUST), and commercial software for scale and automation.
How do you make your best choice among anonymization approaches?
As the data privacy landscape changes, being proactive is the winning strategy. Leading organizations around the globe are adopting best practice frameworks such as Privacy by Design. In fact, GDPR now requires data protection by design and by default (Article 25), an obligation that closely reflects the Privacy by Design framework.
These leading organizations know that maximizing data value with HIPAA and GDPR compliance through anonymization pays off, in terms of innovation, efficiency and revenue.
So where do more and more of these leaders go for guidance regarding data anonymization? The answer is Privacy Analytics.
Our executives, and our global team of data scientists and business analysts, are the trusted third-party experts for healthcare, IT and diversified companies worldwide.
Finally, as a primer on data de-identification, we recommend the recent White Paper by Dr. Khaled El Emam and Luk Arbuckle, The Five Safes of Risk-Based Anonymization.