Study Is First to Validate Application of Hybrid OCR/NLP Technology for Large-Scale Data Extraction From Scanned Colonoscopy and Pathology Reports
A study published in the May issue of Gastrointestinal Endoscopy (www.giejournal.org) demonstrates and validates for the first time that optical character recognition (OCR) combined with natural language processing (NLP) technology analyzes scanned procedure and pathology reports accurately and efficiently – eliminating the time and cost of manual data extraction by delivering electronically processable clinical information.
In the retrospective study, conducted at Cleveland Clinic, Cleveland, Ohio, and the University of Minnesota, Minneapolis, researchers randomly sampled outpatient screening colonoscopy procedure and pathology reports. Two researchers first manually reviewed the reports for the desired variables; the OCR/NLP algorithm was then used to obtain the same variables from three different electronic health record systems: Epic, ProVation, and Sunquest PowerPath.
Among the key results of the study: The OCR/NLP technology extracted desired variables from reports contained in an image format with an accuracy of >95%. Compared with manual data extraction, the accuracy of the hybrid approach to detect polyps was 95.8%, adenomas 98.5%, sessile serrated polyps 99.3%, advanced adenomas 98%, inadequate bowel preparation 98.4%, and failed cecal intubation 99%. A comparison of the dataset collected via NLP alone versus that collected using the hybrid OCR/NLP approach showed the accuracy for almost all variables was >99%.
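To make the accuracy figures above concrete, the following is a minimal sketch of how per-variable agreement between automated and manual extraction could be computed. It is not the study's code: the function name `per_variable_accuracy`, the variable names, and the four toy reports are all hypothetical, and it assumes accuracy means the fraction of reports on which the automated value matches the manual reviewers' value.

```python
from typing import Dict, List

Report = Dict[str, bool]  # one report: variable name -> finding present?


def per_variable_accuracy(manual: List[Report],
                          extracted: List[Report]) -> Dict[str, float]:
    """Fraction of reports where the extracted value matches manual review.

    Assumption: manual review is treated as the reference standard, and
    both lists describe the same reports in the same order.
    """
    if len(manual) != len(extracted):
        raise ValueError("manual and extracted must cover the same reports")
    if not manual:
        return {}
    accuracy = {}
    for variable in manual[0]:
        matches = sum(
            1 for m, e in zip(manual, extracted) if m[variable] == e[variable]
        )
        accuracy[variable] = matches / len(manual)
    return accuracy


# Hypothetical toy data: four reports, two variables.
manual_review = [
    {"polyp": True, "adenoma": True},
    {"polyp": True, "adenoma": False},
    {"polyp": False, "adenoma": False},
    {"polyp": False, "adenoma": False},
]
ocr_nlp_output = [
    {"polyp": True, "adenoma": True},
    {"polyp": True, "adenoma": False},
    {"polyp": True, "adenoma": False},   # disagreement on "polyp"
    {"polyp": False, "adenoma": False},
]

accuracy = per_variable_accuracy(manual_review, ocr_nlp_output)
```

In this toy example the automated output agrees with manual review on three of four reports for polyps (0.75) and on all four for adenomas (1.0). The same comparison could be run between an NLP-only dataset and a hybrid OCR/NLP dataset.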
“The results of this proof-of-concept study create a new frontier in the use of large-scale data extraction from scanned reports, which was previously limited by lack of appropriate technology,” said Maged Rizk, MD, a gastroenterologist and associate director for the Cleveland Clinic Medicare Accountable Care Organization.
Dr. Rizk, lead author of the research, explained that while data show colonoscopy screening has led to lower colorectal cancer incidence and mortality, increasing evidence suggests that examination quality may affect its effectiveness. The information needed to assess exam quality is often embedded in non-standardized procedure reports of varying formats within EHRs, requiring time-consuming and costly data extraction for accurate reporting. As a result, this information is not readily available for streamlining quality management, participating in endoscopy registries, or reporting patient- and center-specific risk factors predictive of outcomes.
“A process which was previously expensive and time-consuming can now potentially be done accurately in a time- and labor-efficient manner,” explained Dr. Rizk, who was part of the 11-member team of physicians and researchers who co-authored the study.
The team also included the late Colin Rhodes, former Chief Technology Officer of eHealth Technologies, who colleagues say was “vital” in the development of the OCR/NLP hybrid technology. Mr. Rhodes passed away prior to the study’s publication.
“The contributions of Mr. Rhodes to this collaboration support our company’s mission to provide seamless access to healthcare information. eHealth Technologies is continuously seeking ways to streamline the critical data that physicians need to deliver lifesaving care for their patients,” said Jeff Markin, CEO, eHealth Technologies.
Future multicenter studies that pair OCR with validated, commercially available NLP tools will help substantiate the use of this novel technology on a larger scale, not only for measuring procedure quality indicators but potentially for many other areas of health care as well.