Re-identification Risk Determinations (RRDs) are assessments in which an expert applies generally accepted statistical methods to determine how identifiable individuals are in a dataset. The expert recommends the changes needed to de-identify or anonymize the data and may act as a Trusted Third Party (TTP) to apply those changes.
RRDs align with the Expert Determination method under the Health Insurance Portability and Accountability Act (HIPAA) and with assessments of identifiability under the General Data Protection Regulation (GDPR).
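As a rough illustration of what "how identifiable individuals are" can mean in practice, one commonly used measure groups records by their quasi-identifiers (fields such as age band or postal code that could be matched against outside information) and looks at how small those groups are. The sketch below is only an illustration: the field names and toy records are assumptions, and it is not a description of any particular expert's methodology. A real determination also weighs the context in which the data is shared.

```python
from collections import Counter

def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def max_record_risk(records, quasi_identifiers):
    """Worst case: 1 / size of the smallest group (1.0 means a record is unique)."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return 1 / min(sizes.values())

def average_record_risk(records, quasi_identifiers):
    """Mean of 1/k over all records, where k is the size of the record's group.

    Each group of size k contributes k * (1/k) = 1, so the sum over all
    records equals the number of distinct groups.
    """
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return len(sizes) / len(records)

# Hypothetical toy records; the field names are assumptions for illustration.
records = [
    {"age_band": "30-39", "zip3": "100", "sex": "F"},
    {"age_band": "30-39", "zip3": "100", "sex": "F"},
    {"age_band": "30-39", "zip3": "100", "sex": "M"},
    {"age_band": "40-49", "zip3": "945", "sex": "M"},
    {"age_band": "40-49", "zip3": "945", "sex": "M"},
]

all_fields = ["age_band", "zip3", "sex"]
fewer_fields = ["age_band", "zip3"]  # e.g. after suppressing the "sex" field

print(max_record_risk(records, all_fields))        # one record is unique
print(max_record_risk(records, fewer_fields))      # smallest group now has 2 records
print(average_record_risk(records, all_fields))
print(average_record_risk(records, fewer_fields))
```

In this toy data, suppressing one field removes the unique record, which is the kind of trade-off between risk and data granularity that an expert's recommendations address.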
Experts performing these assessments pay close attention to the statistical methods used. However, several non-technical factors can increase (or decrease!) the likelihood that a data privacy initiative is successful.
Privacy Analytics has been supporting clients and partners with RRDs since 2007. In that time, we’ve discovered several factors that can significantly affect your organization’s ability to reach its desired outcomes. Drawing from those, here are 10 best practices for RRDs:
- Partner with your de-identification or anonymization expert to build their familiarity with your data, use case, and stakeholders. An ongoing partnership can make repeat assessments faster and more effective.
- Challenge your organizational assumptions around what types of data can or can’t be de-identified or anonymized and what the resulting data will look like. Technology and capabilities evolve rapidly, and increasingly complex data privacy projects, such as those involving tokenization, unstructured text, or imaging data, are now quite tractable.
- Plan for anticipated changes in data, data flows, end users, or environments – an RRD may be able to accommodate planned scenarios that aren’t yet in place. You can also consider setting up RRDs that can be adapted or amended to cover foreseen (or unforeseen!) changes in the scope of the de-identified or anonymized data sharing.
- Identify similar data-sharing scenarios, which can be opportunities to streamline the assessment of multiple scenarios under a single RRD. This can reduce effort, time, and cost in documenting de-identification or anonymization approaches.
- Assemble the right project team to support an RRD. Involving data experts, legal/governance representatives, and an end user ensures that the project is well-informed about the nature and provenance of the data, the regulatory and contractual requirements, and the ultimate uses of the de-identified or anonymized data.
- Understand where the data is going, as the likelihood of a re-identification attempt occurring and being successful is directly informed by the end user, destination environment, and co-located data assets therein.
- Know your business priorities, as RRDs can be tailored for speed, affordability, or flexibility. The appropriate balance of emphasis will be specific to the needs of your organization.
- Understand the needs of end users, who will ultimately derive value from the de-identified or anonymized data. The fields and data granularity necessary are driven by end-user needs, which can vary from case to case, so early alignment will help avoid expensive and time-consuming rework.
- Work with clean data where possible. When designing a pipeline for de-identified or anonymized data, positioning data cleanup upstream of the RRD will streamline the RRD process and help ensure data quality. Cleanup can be done either independently of the RRD or as part of a TTP function bundled with an RRD to provide private and curated data output.
- Keep track of the RRD’s expiry date so that you are not caught flat-footed and there are no gaps in defensibility documentation.
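
The expiry tracking in the last item above can be sketched as a simple date check. Everything specific here is hypothetical and chosen only for illustration: the register of RRD names, the dates, the 90-day warning window, and the choice to treat the expiry date as the last valid day. Actual validity periods come from the determination itself.

```python
from datetime import date, timedelta

def rrd_status(expiry, today, warn_window_days=90):
    """Classify one RRD as 'expired', 'renew soon', or 'current'.

    Assumption: the expiry date is treated as the last day the RRD is valid.
    """
    if today > expiry:
        return "expired"
    if expiry - today <= timedelta(days=warn_window_days):
        return "renew soon"
    return "current"

def rrds_needing_attention(register, today, warn_window_days=90):
    """Return only the RRDs that are expired or inside the warning window."""
    statuses = {
        name: rrd_status(expiry, today, warn_window_days)
        for name, expiry in register.items()
    }
    return {name: s for name, s in statuses.items() if s != "current"}

# Hypothetical register mapping an RRD label to its expiry date.
register = {
    "claims-data-share": date(2025, 3, 1),
    "imaging-research": date(2025, 6, 15),
    "registry-export": date(2026, 1, 10),
}

attention = rrds_needing_attention(register, today=date(2025, 4, 1))
for name, status in attention.items():
    print(f"{name}: {status}")
```

Running a check like this on a schedule, with a warning window long enough to cover the time a renewal assessment takes, is one way to avoid a gap between determinations.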
These best practices can enable you to streamline the Re-identification Risk Determination process, reducing costs, increasing efficiency, and improving your return on investment in data privacy.
Contact an expert at Privacy Analytics to discuss Re-identification Risk Determinations and how the right approach can provide a smoother, higher-value experience for your organization.