The First Step Towards Sharing Big Data
If you are planning on sharing health data for secondary purposes, the first step towards sharing Big Data is de-identification. As previous blog posts have shown, not all de-identification techniques are the same. When it comes to leveraging Big Data, risk-based de-identification maintains the granularity needed for research. Finding patterns and trends in data requires that the dataset retain its data quality.
Maintaining this quality is especially crucial as more and more data is aggregated from various sources and linked together. When it comes to using the data for research, analytics and other secondary uses, demand will be greatest for those data repositories that make the most of linked data. This includes:
1) Aggregated data from healthcare providers around the country so that a greater portion of the population is represented; and,
2) Connected data from EHRs and disease registries with other administrative, pharmaceutical and claims data to provide a greater amount of detail on a patient’s care experience.
The second example, the comprehensive view of the patient’s journey, raises concerns around patient confidentiality. The more information available about an individual, the more likely it is that identifiers within the data could be used to re-identify that person. If a dataset contains information about a stigmatizing health condition or other highly sensitive information, re-identification could have devastating consequences for a patient. The patient would not be the only one to suffer. A privacy breach could also result in actions being taken against the data owner for failing to take appropriate precautions to prevent it.
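To make the point about linked data concrete, here is a minimal Python sketch. It assumes a simple risk measure that this post does not itself specify: one divided by the size of the smallest group of records sharing the same combination of values. The records, field names and the function `max_reidentification_risk` are all hypothetical and invented for illustration.

```python
from collections import Counter

# Hypothetical toy records. Every field name and value is invented for illustration.
records = [
    {"age_band": "30-39", "region": "East", "sex": "F", "diagnosis_year": 2012},
    {"age_band": "30-39", "region": "East", "sex": "F", "diagnosis_year": 2013},
    {"age_band": "30-39", "region": "East", "sex": "M", "diagnosis_year": 2012},
    {"age_band": "30-39", "region": "East", "sex": "M", "diagnosis_year": 2012},
    {"age_band": "30-39", "region": "West", "sex": "F", "diagnosis_year": 2012},
    {"age_band": "30-39", "region": "West", "sex": "F", "diagnosis_year": 2012},
]


def max_reidentification_risk(rows, fields):
    """Return 1 / (size of the smallest group of rows sharing the same values).

    This is an assumed, simplified risk measure: a group of size 1 means
    at least one record is unique on the chosen fields.
    """
    groups = Counter(tuple(row[f] for f in fields) for row in rows)
    return 1 / min(groups.values())


# Risk as more fields become available to someone attempting re-identification.
for fields in (
    ["age_band"],
    ["age_band", "region"],
    ["age_band", "region", "sex", "diagnosis_year"],
):
    print(fields, round(max_reidentification_risk(records, fields), 2))
```

In this toy dataset the measure rises as fields are added, because each extra field splits the records into smaller groups until some records are unique. A real risk assessment would also account for factors such as who receives the data and what other data they can access.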
De-identification is a Risk Management Exercise
The responsible sharing of healthcare data begins by taking steps to ensure that protected health information (PHI) that could be used to identify a patient is handled appropriately. De-identifying data renders it anonymous so that it can be used safely. Effective de-identification requires a statistical approach to mitigate risks to patients and data owners. It is a critical step in taking advantage of the opportunities provided by Big Data Analytics.
Looking to learn more about de-identification for Big Data Analytics? Make sure to read our latest white paper, Unlocking Big Data for Healthcare.