The First Step Towards Sharing Big Data

If you plan to share health data for secondary purposes, the first step is de-identification. As previous blog posts have shown, not all de-identification techniques are the same. When it comes to leveraging Big Data, risk-based de-identification maintains the granularity needed for research. Finding patterns and trends in data requires that the dataset retain its quality.

Maintaining this quality is especially crucial as more and more data is aggregated from various sources and linked together. For research, analytics and other secondary uses, demand will be greatest for those data repositories that make the most of linked data. This includes:

1) Aggregated data from healthcare providers around the country so that a greater portion of the population is represented; and,

2) Connected data from EHRs and disease registries with other administrative, pharmaceutical and claims data to provide a greater amount of detail on a patient’s care experience.

The second example, the comprehensive view of the patient’s journey, raises concerns around patient confidentiality. The more information available about an individual, the more likely it is that identifiers within the data could be used to re-identify that person. If a dataset contains information about a stigmatizing health condition or other highly sensitive information, re-identification could have devastating consequences for a patient. The patient would not be the only one to suffer. It could also result in actions being taken against the data owner for failing to take appropriate precautions to prevent a privacy breach.
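To make the link between added detail and re-identification concrete, here is a minimal sketch in Python. It assumes one common way of measuring risk: group records that share the same values on a set of indirectly identifying fields, and take the risk to be one over the size of the smallest group. The field names (`birth_year`, `zip3`, `admission_month`) and the toy records are hypothetical and are not drawn from any real dataset.

```python
from collections import Counter

def max_reid_risk(records, fields):
    """Return 1 / (size of the smallest group of records that share
    the same values on the given fields)."""
    keys = [tuple(record[f] for f in fields) for record in records]
    group_sizes = Counter(keys)
    return 1 / min(group_sizes.values())

# Hypothetical toy records; each dict stands for one patient.
records = [
    {"birth_year": 1980, "zip3": "100", "admission_month": "2015-01"},
    {"birth_year": 1980, "zip3": "100", "admission_month": "2015-02"},
    {"birth_year": 1980, "zip3": "101", "admission_month": "2015-01"},
    {"birth_year": 1980, "zip3": "101", "admission_month": "2015-01"},
]

# One field: all four patients look alike.
risk_one_field = max_reid_risk(records, ["birth_year"])

# Two fields: patients fall into two groups of two.
risk_two_fields = max_reid_risk(records, ["birth_year", "zip3"])

# Three linked fields: two patients are now unique.
risk_three_fields = max_reid_risk(
    records, ["birth_year", "zip3", "admission_month"]
)

print(risk_one_field, risk_two_fields, risk_three_fields)
```

In this toy case, each additional linked field splits the patients into smaller groups, so the worst-case risk climbs from 0.25 to 0.5 to 1.0.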

De-identification is a Risk Management Exercise

The responsible sharing of healthcare data begins by taking steps to ensure that protected health information (PHI) that could be used to identify a patient is handled appropriately. De-identifying data renders it anonymous so that it can be used safely. Effective de-identification requires a statistical approach to mitigate risks to patients and data owners. It is a critical step to take advantage of the opportunities provided by Big Data Analytics.
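As an illustration of what a statistical, risk-based approach can look like, the sketch below measures risk, generalizes a field, and then measures again against a threshold. It is a simplified example under stated assumptions, not the specific methodology of any product or white paper: the risk measure (one over the smallest group size), the ten-year age bands, and the `RISK_THRESHOLD` value are all hypothetical choices made for this example.

```python
from collections import Counter

def max_reid_risk(values):
    """Return 1 / (size of the smallest group of identical values)."""
    group_sizes = Counter(values)
    return 1 / min(group_sizes.values())

def generalize_age(age, band=10):
    """Replace an exact age with a band such as '30-39'."""
    low = (age // band) * band
    return f"{low}-{low + band - 1}"

# Hypothetical exact ages for eight patients.
ages = [31, 34, 36, 38, 42, 45, 47, 49]

# Every exact age is unique, so the worst-case risk is 1.0.
risk_raw = max_reid_risk(ages)

# Generalizing to ten-year bands gives two groups of four.
bands = [generalize_age(age) for age in ages]
risk_banded = max_reid_risk(bands)

# Hypothetical threshold; in practice it depends on the release context.
RISK_THRESHOLD = 0.33
acceptable = risk_banded <= RISK_THRESHOLD

print(risk_raw, risk_banded, acceptable)
```

The design point is that the data is changed only as much as the measured risk requires, which is how a risk-based approach can keep more granularity than blanket removal of fields.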

Looking to learn more about de-identification for Big Data Analytics? Make sure to read our latest white paper, Unlocking Big Data for Healthcare.
