If your organization is focused on maximizing data value, while also protecting data as a core asset, you may be on the horns of a privacy dilemma. Why?
As an analytics leader or Chief Data Officer, you may be focused on leveraging your organization’s sensitive data assets to drive business impact — you’re likely looking at the potential of personal data.
As a Chief Privacy Officer or Data Protection Officer, your primary focus is likely the protection of personal data.
Whether you’re a CDO or CPO, in the context of U.S. healthcare, you could be up to your neck in HIPAA. And if you operate in the EU as a CDO, CPO, or DPO, or hold data of European residents, you’re in a similar situation with GDPR.
Three big data value questions
You’re likely consumed by three big questions surrounding data value:
1. How can we mitigate our legal exposure?
2. What’s the best way to unlock value from personal data—without destroying its utility through anonymization?
3. How can we increase our access to valuable data held by external suppliers?
To answer those three important questions, you’ll be looking for three things in this article:
- Learn how to bridge the gap between data protection and data value.
- Find out how anonymization can unlock the most business value from personal data.
- Understand the pros and cons of approaches to anonymization as they relate to driving business value of information (BVI) and cost value of information (CVI).
To help you achieve these three learnings, you’ll want to review the comparison chart below, which offers side-by-side comparisons to assist you in aligning the best anonymization approach to your specific business needs.
The need to focus on both offense and defense
If you’re a CDO, CPO, or DPO, you’re in a tricky position. The CDO advocates the effective use of data, while the CPO or DPO is mandated to support privacy. You need to work together. You need a strategy to unlock more value from personal data and, at the same time, remain judicious about protecting data privacy.
To use a sports analogy, as coaches of your team, you need to be as focused on defense as on offense.
With all the compliance requirements, do you sometimes feel like you’re grinding it out on defense, and making little progress on your value-generating, business-transforming goals? Or perhaps, alternatively, you’re finally generating value—making tangible gains—but are kept awake at night by the fear you’re not fully compliant?
Learn what’s in the winners’ playbook
How about a glimpse into the playbook showing real business problems that some of the world’s top companies are solving with Privacy Analytics? Organizations with a leading record for maximizing data value, such as these:
- A top tier revenue cycle management company whose use of Privacy Analytics’ risk-based de-identification software resulted in healthcare providers improving the quality and efficiency of patient care.
- A multinational conglomerate with $30 billion in annual sales that worked with Privacy Analytics to anonymize large datasets, enabling revenue growth through information-related initiatives such as third-party research and analytics, as well as healthcare network improvements.
- A U.S.-based private research university that was able to maintain compliance while receiving reimbursements by insurance companies and the federal government.
These organizations are using effective anonymization techniques to not only commercialize sensitive data, but also to successfully employ data internally—reducing operational processes and taking full responsibility for sensitive data.
The common denominator for these teams is that they all leverage the added value beyond anonymization, with a repeatable process that minimizes subjectivity and can be operationalized successfully.
The bottom line: a win-win for data privacy and business value.
How to adopt an effective anonymization strategy
Many of the new, innovative and value-driving uses of data are enabled by anonymization because consent or authorization is impractical—sometimes even impossible—under privacy laws. Therefore, you need to be committed to anonymization to drive value.
However, if anonymization isn’t done well, several things can happen:
1) You risk non-compliance with GDPR and HIPAA,
2) You face legal exposure, a negative impact on trust, and brand damage,
3) You destroy the utility of data during the anonymization process.
That’s why you need an effective anonymization strategy: one that has operationalized a contextual evaluation of risk.
Let’s look at the two prevailing approaches to data privacy: risk-based and rules-based de-identification.
Risk-based versus rules-based approaches
When it comes to personal data, including personally identifiable information (PII) and protected health information (PHI), producing a high-value dataset that meets specific needs means addressing privacy concerns. Doing so requires organizations to de-identify personal information using a risk-based approach that goes beyond simple masking techniques.
For a better understanding of why successful organizations choose a risk-based approach, consider the graphics later in this article. These graphics illustrate the major differences between the main approaches in a HIPAA context, which applies to U.S. companies working with PHI.
It’s worth noting that with the advent of GDPR, the bar has been raised. For companies who operate beyond the U.S., HIPAA-compliance is no longer ‘good enough’. And as new legislation is proposed and enacted, what was done yesterday to protect personal data may not be good enough tomorrow.
If you are in the healthcare sector, you already know HIPAA Safe Harbor, a rules-based approach that outlines 18 different identifiers that must be removed from data that will be used for secondary purposes.
These identifiers include:
- Name
- Address (all geographic subdivisions smaller than state, including street address, city, county, and zip code)
- All elements (except years) of dates related to an individual (including birthdate, admission date, discharge date, date of death, and exact age if over 89)
- Telephone numbers
- Fax number
- Email address
- Social Security Number
- Medical record number
- Health plan beneficiary number
- Account number
- Certificate or license number
- Vehicle identifiers and serial numbers, including license plate numbers
- Device identifiers and serial numbers
- Web URL
- Internet Protocol (IP) Address
- Finger or voice print
- Photographic image – Photographic images are not limited to images of the face.
- Any other characteristic that could uniquely identify the individual
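To make the rules-based idea concrete, here is a minimal Python sketch of a few Safe Harbor-style rules applied to a single record. It is an illustration only, not a compliance tool: the field names (`name`, `zip_code`, `admission_date`, and so on) and the `safe_harbor_style` function are hypothetical, and only a small subset of the identifiers listed above is handled.

```python
from datetime import date

# Hypothetical field names standing in for some of the direct identifiers above.
DIRECT_IDENTIFIERS = {
    "name", "street_address", "phone", "fax", "email",
    "ssn", "medical_record_number", "ip_address",
}

def safe_harbor_style(record: dict) -> dict:
    """Apply a simplified subset of Safe Harbor-style rules to one record."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # direct identifiers are dropped entirely
        if field == "zip_code":
            continue  # this sketch removes geography smaller than state
        if isinstance(value, date):
            out[field + "_year"] = value.year  # keep only the year of a date
        elif field == "age":
            out[field] = "90+" if value > 89 else value  # group ages over 89
        else:
            out[field] = value  # every other field passes through unchanged
    return out

record = {
    "name": "Jane Doe",
    "zip_code": "90210",
    "state": "CA",
    "admission_date": date(2021, 5, 17),
    "age": 93,
    "diagnosis": "asthma",
    "occupation": "judge",
}
cleaned = safe_harbor_style(record)
print(cleaned)
```

Note that fields such as `diagnosis` and `occupation` pass through untouched, because the rules only look for the listed identifiers.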
HIPAA Expert Determination, on the other hand, is a risk-based approach that applies statistical or scientific principles to provide a very small, quantifiable risk that an anticipated recipient of the data could identify an individual. This approach also requires that the methods and results of the analysis are documented in a defensible compliance report.
Here, we’re comparing HIPAA Safe Harbor (a rules-based example that is easy to visualize) with HIPAA Expert Determination, which is a risk-based example. To further simplify the illustration, we are only presenting a subset of data elements, and a small sample of records.
First, consider this example of raw data.
Please note that this example has been created for demonstration purposes only, and is not intended to reference actual persons.
Raw personal data can’t be used for most secondary purposes. Although data utility would be high, the risk level would also be very high. And, as such, it would not be compliant under the HIPAA Privacy Rule, nor anonymized under the GDPR.
Next, an example of a HIPAA Safe Harbor (rules-based) approach:
Here, we often see a significant negative impact on data utility. Although the resulting data is HIPAA-compliant, risk management is moderate (and can be low in some contexts).
Furthermore, this method leaves many other fields untouched. Anything beyond the 18 identifiers that must be removed stays in the data, so these other fields can still be high risk. HIPAA Safe Harbor should not be used outside the HIPAA jurisdiction of PHI.
Additionally, note that the data would not be considered fully anonymized under other regulations such as GDPR. Why? As an expert analysis comparing HIPAA Safe Harbor and Expert Determination explains, indirect identifiers that Safe Harbor leaves in place can still be used to identify individuals.
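One common way to see how indirect identifiers can single someone out is to count how many records share the same combination of values. The Python sketch below does this with a k-anonymity-style measure. It is an assumption chosen for illustration, not the method used in the analysis mentioned above, and the field names and the `max_reidentification_risk` function are hypothetical.

```python
from collections import Counter

def max_reidentification_risk(records, quasi_identifiers):
    """Return 1 / (size of the smallest group sharing the same indirect identifiers)."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)  # how many records share each combination
    return 1 / min(class_sizes.values())

# Toy records with direct identifiers already removed (not real persons).
records = [
    {"state": "CA", "birth_year": 1950, "occupation": "judge"},
    {"state": "CA", "birth_year": 1950, "occupation": "nurse"},
    {"state": "CA", "birth_year": 1950, "occupation": "nurse"},
]

# The judge is unique on all three fields, so the worst-case risk is 1.0.
risk_all = max_reidentification_risk(records, ["state", "birth_year", "occupation"])

# Without occupation, all three records look alike, so the risk drops to 1/3.
risk_fewer = max_reidentification_risk(records, ["state", "birth_year"])

print(risk_all, risk_fewer)
```

Even with names, addresses, and full dates removed, a rare combination of remaining fields can point to one person.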
And, finally, a risk-based Expert Determination example:
With this approach, data utility remains high while the risk level is low. And, of critical importance, the approach is both HIPAA-compliant and anonymized under GDPR (and most other privacy regulations globally) when the appropriate assessments have taken place and have been properly documented in a defensible compliance report.
This is because a risk-based approach measures two critical aspects:
1. Data Risk: A function of the dataset characteristics
2. Context Risk: A function of where and how the de-identified data will be disclosed/used.
Higher-risk contexts require greater data transformations (data disruption/modification to reduce identifiability), whereas lower-risk contexts require less. The risk-based calibration preserves utility while ensuring privacy.
The opportunity also exists to improve data utility through improved technical and organizational controls, which provide a lower-risk context.
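The calibration idea can be sketched in a few lines of Python. The sketch assumes that overall risk is the product of data risk and context risk, and it uses made-up context probabilities and a made-up threshold of 0.15. None of these values come from this article or from any regulation, and the function names are hypothetical.

```python
from collections import Counter

def generalize_age(age: int, band: int) -> str:
    """Replace an exact age with a band such as '40-44'; band=1 keeps the exact age."""
    if band == 1:
        return str(age)
    low = (age // band) * band
    return f"{low}-{low + band - 1}"

def data_risk(ages, band):
    """Data risk: 1 / (size of the smallest group after generalizing ages)."""
    classes = Counter(generalize_age(a, band) for a in ages)
    return 1 / min(classes.values())

def choose_band(ages, context_risk, threshold, bands=(1, 5, 10, 20)):
    """Pick the smallest band where data risk x context risk meets the threshold."""
    for band in bands:
        if data_risk(ages, band) * context_risk <= threshold:
            return band
    return None  # no candidate is enough; stronger transformations or controls are needed

ages = [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
THRESHOLD = 0.15  # assumed acceptable overall risk, for illustration only

controlled = choose_band(ages, context_risk=0.1, threshold=THRESHOLD)  # strong controls
shared = choose_band(ages, context_risk=0.5, threshold=THRESHOLD)      # weaker controls
public = choose_band(ages, context_risk=1.0, threshold=THRESHOLD)      # public release

print(controlled, shared, public)  # prints: 1 5 10
```

In this toy example the tightly controlled context keeps exact ages, while the public release needs ten-year bands. The same data loses less detail when the context is safer.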
Actual transformations depend on the level of completeness of the dataset as well as context.
These illustrations are intended to help show why a risk-based approach makes the most sense for organizations seeking to unlock maximum value from data while providing full compliance with privacy laws.
Complex algorithms; practical solutions
In some circles, there’s a perception that risk-based methods are difficult to implement. Although the underlying statistical methods can be complex, and manual implementation can be onerous and unrealistic in many cases, practical solutions do exist today. There are open methodologies, training programs (e.g., HITRUST), and commercial software for scale and automation.
How do you make your best choice among anonymization approaches?
As the data privacy landscape changes, being proactive is the winning strategy. Leading organizations around the globe are adopting best practice frameworks such as Privacy by Design. In fact, GDPR (Article 25) now requires data protection by design and by default.
These leading organizations know that maximizing data value with HIPAA and GDPR compliance through anonymization pays off, in terms of innovation, efficiency and revenue.
So where do more and more of these leaders go for guidance regarding data anonymization? The answer is Privacy Analytics.
Our executives, and our global team of data scientists and business analysts, are the trusted third-party experts for healthcare, IT and diversified companies worldwide.
Privacy Analytics’ client success stories include
- One of the world’s largest pharmaceutical multinationals was enabled to fulfill EMA (European Medicines Agency) Policy 0070 document releases.
- A software company using artificial intelligence to derive diagnoses from medical images sourced from a variety of hospital partners and hardware configurations was enabled to apply the process of de-identification to the onboarding of new machines.
- A global provider of pre-operative planning software and intra-operative surgical robots was enabled to implement privacy and security controls to ensure data recipients can manage data access and use appropriately.
Finally, as a primer on data de-identification, we recommend the recent White Paper by Dr. Khaled El Emam and Luk Arbuckle, The Five Safes of Risk-Based Anonymization.