As an analytics leader or Chief Data Officer (CDO), you are likely focused on using your organization’s sensitive data assets to drive business impact, which means looking at the potential of personal data.
As a Chief Privacy Officer (CPO) or Data Protection Officer (DPO), your primary focus is likely the protection of that personal data.
Whether you’re a CDO or CPO, in the context of U.S. healthcare, you could be up to your neck in HIPAA. And if you operate in the EU as a CDO, CPO, or DPO, or hold data of European residents, you’re in a similar situation with GDPR.
You’re likely consumed by three big questions surrounding data value:
To help you answer these three questions, review the comparison chart below, which sets the approaches side by side so you can match the best de-identification or anonymization approach to your specific business needs.
If you’re a CDO, CPO, or DPO, you’re in a tricky position: you sit on opposite sides of the table. The CDO advocates the effective use of data, while the CPO or DPO is mandated to adhere to privacy regulations and requirements. You need to work together. You need a strategy to unlock more value from personal data and, at the same time, remain judicious about protecting data privacy.
To use a sports analogy, as coaches of your team, you need to focus as much on defense as on offense.
With all the compliance requirements, do you sometimes feel like you’re grinding it out on defense and making little progress on your value-generating, business-transforming goals? Or perhaps you’re finally generating value and making tangible gains, but are kept awake at night by the fear that you’re not fully compliant?
How about a glimpse into the playbook showing real business problems that some of the world’s top companies are solving with Privacy Analytics? Organizations with a leading record for maximizing data value, such as these:
These organizations are using effective anonymization techniques not only to commercialize sensitive data but also to employ data successfully internally, reducing operational processes and taking full responsibility for sensitive data.
The common denominator for these teams is that they all leverage the added value beyond anonymization, with a repeatable process that minimizes subjectivity and can be operationalized successfully.
The bottom line: a win-win for data privacy and business value.
Many of the new, innovative and value-driving uses of data are enabled by anonymization because consent or authorization is impractical—sometimes even impossible—under privacy laws. Therefore, you need to be committed to anonymization to drive value.
However, if anonymization isn’t done well, several things can happen:
That’s why you need an effective anonymization strategy: one that has operationalized a contextual evaluation of risk.
Let’s look at the two prevailing approaches to data privacy: risk-based and rules-based de-identification.
When it comes to personal data, including personally identifiable information (PII) and protected health information (PHI), producing a high-value dataset that meets specific needs also means addressing privacy concerns. Doing so requires organizations to de-identify personal information using a risk-based approach that goes beyond simple masking techniques.
For a better understanding of why successful organizations choose a risk-based approach, consider the graphics later in this article. These graphics illustrate the major differences between the main approaches in a HIPAA context, which applies to U.S. companies working with PHI.
It’s worth noting that with the advent of GDPR, the bar has been raised. For companies that operate beyond the U.S., HIPAA compliance is no longer ‘good enough’. And as new legislation is proposed and enacted, what was done yesterday to protect personal data may not be good enough tomorrow.
If you are in the healthcare sector in the U.S., you already know HIPAA Safe Harbor, a rules-based de-identification approach that lists 18 types of identifiers that must be removed from data that will be used for secondary purposes.
These examples are:
HIPAA Expert Determination, on the other hand, is a risk-based de-identification approach that applies statistical or scientific principles to provide a very small, quantifiable risk that an anticipated recipient of the data could identify an individual. This approach also requires that the methods and results of the analysis are documented in a defensible report.
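To make the contrast between the two approaches concrete, here is a minimal Python sketch. It is not the method prescribed by HIPAA or used by any particular vendor. It assumes a hypothetical five-record dataset, treats `name` as the only listed identifier (the actual Safe Harbor list has 18 types), and uses one common illustrative risk measure: one divided by the size of the smallest group of records that share the same quasi-identifier values. All field names, values, and function names are invented for illustration.

```python
from collections import Counter

# Hypothetical records, invented for illustration only.
RECORDS = [
    {"name": "Patient 1", "zip": "10001", "birth_year": 1980, "diagnosis": "asthma"},
    {"name": "Patient 2", "zip": "10001", "birth_year": 1980, "diagnosis": "diabetes"},
    {"name": "Patient 3", "zip": "10002", "birth_year": 1975, "diagnosis": "asthma"},
    {"name": "Patient 4", "zip": "10002", "birth_year": 1975, "diagnosis": "flu"},
    {"name": "Patient 5", "zip": "10003", "birth_year": 1990, "diagnosis": "flu"},
]

# Assumption: a one-item stand-in for a fixed list of identifiers to remove.
LISTED_IDENTIFIERS = {"name"}

# Assumption: the fields an adversary could plausibly link to other data.
QUASI_IDENTIFIERS = ("zip", "birth_year")


def rules_based(records):
    """Drop every field on the fixed list; nothing else is examined."""
    return [
        {field: value for field, value in record.items()
         if field not in LISTED_IDENTIFIERS}
        for record in records
    ]


def max_reidentification_risk(records, quasi_identifiers):
    """Return 1 / (size of the smallest group sharing quasi-identifier values)."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return 1 / min(groups.values())


stripped = rules_based(RECORDS)
risk = max_reidentification_risk(stripped, QUASI_IDENTIFIERS)
print(f"Fields kept: {sorted(stripped[0])}")
print(f"Maximum re-identification risk on remaining fields: {risk}")
```

In this toy data, removing the listed field leaves one record that is unique on ZIP code and birth year, so the measured risk stays at its maximum. A rules-based step on its own does not reveal this; a risk-based step measures it and gives a number that can be documented.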
Here, we’re comparing HIPAA Safe Harbor (a rules-based example that is easy to visualize) with HIPAA Expert Determination, which is a risk-based example. To further simplify the illustration, we are only presenting a subset of data elements, and a small sample of records.
Please note that this example has been created for demonstration purposes only and is not intended to reference actual persons.
Raw personal data can’t be used for most secondary purposes. Although data utility would be high, the risk level would also be very high. And, as such, it would not be compliant under the HIPAA Privacy Rule, nor anonymized under the GDPR.
Here, we often see a significant negative impact on data utility. Although the resulting data aligns with HIPAA, risk management is moderate (and can be low in some contexts).
Furthermore, this method leaves many other fields untouched: anything beyond the 18 data elements that Safe Harbor requires to be removed. Anonymization under the GDPR requires that the identifiability of other data elements is assessed and that risk is mitigated appropriately.
With this approach, data utility remains high while the risk level is low. And, of critical importance, a risk-based approach can be aligned to both HIPAA de-identification and anonymization under GDPR (and most other privacy regulations globally) when the appropriate assessments have taken place and have been properly documented in a defensible report.
This is because a risk-based approach measures two critical aspects:
Higher-risk contexts require greater data transformations (data disruption/modification to reduce identifiability), whereas lower-risk contexts require less. The risk-based calibration accounts for the protective effects of closely secured analysis environments, preserving utility while ensuring privacy. As such, the opportunity also exists to improve data utility through improved technical and organizational controls, which provide a lower-risk context.
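The calibration idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed procedure. The two context names and their risk thresholds are hypothetical, the only transformation considered is widening birth-year bands, and risk is again approximated as one divided by the smallest group size.

```python
from collections import Counter

# Hypothetical thresholds: a stricter limit for a riskier release context.
THRESHOLDS = {"public_release": 0.1, "controlled_environment": 0.5}

# Hypothetical birth years, invented for illustration only.
BIRTH_YEARS = [1980, 1980, 1981, 1983, 1984, 1986, 1987, 1988, 1989, 1989]


def generalize_year(year, width):
    """Map a year to the start of its band, e.g. 1983 -> 1980 for width 5."""
    return year - year % width


def max_risk(values):
    """Return 1 / (size of the smallest group of identical values)."""
    return 1 / min(Counter(values).values())


def calibrate(years, context, widths=(1, 5, 10, 20)):
    """Pick the narrowest band width whose risk meets the context threshold."""
    threshold = THRESHOLDS[context]
    for width in widths:
        risk = max_risk([generalize_year(y, width) for y in years])
        if risk <= threshold:
            return width, risk
    return None  # no candidate width is sufficient


for context in THRESHOLDS:
    print(context, calibrate(BIRTH_YEARS, context))
```

With this toy data, the controlled environment is satisfied by five-year bands, while the public release needs ten-year bands. The lower-risk context keeps more detail, which is the utility gain the paragraph above describes.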
Actual transformations depend on the level of completeness of the dataset as well as context.
These illustrations are intended to help show why a risk-based approach makes the most sense for organizations seeking to unlock maximum value from data while ensuring alignment with privacy laws.
In some circles, there’s a perception that risk-based methods are difficult to implement. Although the underlying statistical methods used to assess the risk can be complex, risk-based methods can be oriented around practical implementation to focus on simple, understandable outcomes. There are open methodologies, training programs (e.g., HITRUST), and commercial software for scale and automation.
How do you make your best choice among de-identification and anonymization approaches?
As the data privacy landscape changes, being proactive is the winning strategy. Leading organizations around the globe are adopting best practice frameworks such as Privacy by Design. In fact, GDPR now requires data protection by design and by default (Article 25), which reflects the same principle.
Leading organizations know that risk-based approaches to de-identification and anonymization pay off, in terms of innovation, efficiency and revenue.
So where do more and more of these leaders go for guidance regarding data de-identification or anonymization? The answer is Privacy Analytics.
Our executives and our global team of data scientists and business analysts are the trusted experts for companies worldwide in healthcare, life sciences, medical technology and devices, finance and banking, transportation, communications, advertising, IT, and many other sectors.
Finally, as a primer on data de-identification and anonymization, we recommend The Five Safes of Risk-Based Anonymization whitepaper.