The Rise of Big Data in Healthcare
Over the past decade, a metamorphosis has been taking place in how health information is collected and stored. Handwritten scripts and paper charts have given way to real-time medical monitoring and the electronic health records (EHRs) of today.
- Wireless monitoring devices use telemetry to remotely track and report back on a patient’s blood pressure, heart rate or blood glucose levels.
- Digital imaging technologies have become more affordable, resulting in ultrasounds, endoscopies, MRIs and PET scans being used commonly to support early detection and diagnosis. Medical images are the largest contributor to the expanding volume of big data in healthcare.
- EHRs capture longitudinal data on patients to follow changes over time. Thanks to government initiatives like the HITECH Act, they have been broadly adopted. In 2013, 78% of office-based physicians in the U.S. had implemented an EHR, up from 17% ten years earlier.
Along with these changes to the clinical record, health data is no longer captured solely at the point of care. Increasingly, it is being augmented by patient-generated information from fitness tracking devices, mobile applications, social media posts and internet searches. Other sources also contribute to this expanding digital data universe, including gene sequencing, medical research findings, practice guidelines, and administrative data from billing and insurance claims. In fact, healthcare is one of the fastest growing segments of digital information on the planet, with an estimated annual growth rate of 48%. At this rate, the amount of worldwide healthcare data is expected to reach about 2,000 exabytes, or 2 trillion gigabytes, by 2020.
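The figures quoted above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal units (1 exabyte = 1 billion gigabytes) and a constant 48% compound growth rate, and the function names are hypothetical helpers rather than part of any library.

```python
import math

ANNUAL_GROWTH = 0.48       # annual growth rate quoted in the text
GB_PER_EXABYTE = 10**9     # assumption: decimal units, 1 EB = 10^9 GB


def exabytes_to_gigabytes(exabytes):
    """Convert exabytes to gigabytes using decimal units."""
    return exabytes * GB_PER_EXABYTE


def doubling_time_years(rate):
    """Years needed for a quantity to double at a constant compound rate."""
    return math.log(2) / math.log(1 + rate)


def growth_factor(rate, years):
    """Total multiplier after compounding `rate` for `years` years."""
    return (1 + rate) ** years


print(exabytes_to_gigabytes(2000))                    # 2000000000000, i.e. 2 trillion GB
print(round(doubling_time_years(ANNUAL_GROWTH), 2))   # 1.77
```

Under these assumptions, 2,000 exabytes does equal 2 trillion gigabytes, and a 48% annual rate implies the volume of data doubles roughly every 21 months.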
The enormous quantity of information being produced, the speed at which it is being generated and the mix of formats in which it is captured are the three defining characteristics of big data – volume, velocity and variety.
A fourth ‘V’, veracity, is sometimes added when discussing big data in healthcare. Veracity, or data quality, refers to how credible and error-free the data is, and therefore how far any analysis built on it can be trusted. The quality of healthcare data, particularly unstructured data, can be highly variable. Beyond issues common to all text data, such as typos, medical text uses many abbreviations, expresses some health conditions in numerous variant forms, and differs widely in the level of detail practitioners provide in their notes. This makes the de-identification of medical text less clear-cut than that of other forms of data. Effective text anonymization requires expertise not only in masking and de-identification but also in the unique characteristics of this lexicon.
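To make the difficulty concrete, here is a minimal sketch of two steps a text pipeline might take: expanding abbreviations and masking obvious identifiers. It is a toy illustration, not the method used by any particular product. The abbreviation map, the regular expressions, the placeholder tags and the sample note are all hypothetical, and a real clinical lexicon would be far larger and more ambiguous.

```python
import re

# Hypothetical abbreviation map; real clinical abbreviations are
# context-dependent and a single short form can have several meanings.
ABBREVIATIONS = {
    "htn": "hypertension",
    "dm": "diabetes mellitus",
    "sob": "shortness of breath",
}

# Assumed formats for illustration: US-style dates and phone numbers only.
DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")
PHONE_PATTERN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def expand_abbreviations(text):
    """Replace known abbreviations (case-insensitive) with their long form."""
    def replace(match):
        word = match.group(0)
        return ABBREVIATIONS.get(word.lower(), word)
    return re.sub(r"\b[A-Za-z]+\b", replace, text)


def mask_identifiers(text):
    """Mask dates and phone numbers that match the assumed formats."""
    text = DATE_PATTERN.sub("[DATE]", text)
    text = PHONE_PATTERN.sub("[PHONE]", text)
    return text


note = "Pt seen 03/14/2019 with HTN and SOB. Call 555-123-4567."
print(mask_identifiers(expand_abbreviations(note)))
# Pt seen [DATE] with hypertension and shortness of breath. Call [PHONE].
```

Even this small example shows the limits of pattern matching: a misspelled abbreviation, a date written out in words, or a patient name in free text would pass through untouched, which is why familiarity with the medical lexicon matters.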
The Rise of Big Data in Healthcare is the second in the Big Data Analytics Series by Privacy Analytics. Next: Challenges in Big Data Analytics.