Objectives

Traditional systematic literature review (SLR) methodologies are time-consuming and resource-intensive. Our goal is to streamline this process by developing an Intelligent Systematic Literature Reviews (ISLaR) system that uses Generative Pre-trained Transformer 4 (GPT-4) for real-time article screening and data extraction.

Material and Methods

To develop the ISLaR system and evaluate its performance, we employed three use cases: a disease burden SLR in human papillomavirus (HPV) infection, an economic burden SLR in pneumococcal disease, and a clinical trials SLR in endometrial cancer. Studies relevant to each use case were collected from PubMed and Embase using specific keywords. The ISLaR system incorporates all essential components of an SLR process, including study protocol setup, PubMed and Embase query search, abstract and full-text screening, data extraction, and results summarization and visualization. GPT-4 was employed to enhance the efficiency of abstract screening, full-text screening, and data element extraction. Both the abstract and full-text screening components generate an eligibility decision based on the specific inclusion and exclusion criteria in the study protocol. The data element extraction component identifies important data elements in each article. Abstract screening performance was evaluated on 50 randomly selected, expert-labeled articles in each use case.
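The screening step described above, in which a language model turns protocol criteria and an abstract into an eligibility decision, can be illustrated with a minimal sketch. This is not the ISLaR implementation: the prompt wording, the function names, and the `ask_model` callable (a stand-in for whatever GPT-4 client is used) are all assumptions made for illustration.

```python
from typing import Callable, List


def build_screening_prompt(abstract: str,
                           inclusion: List[str],
                           exclusion: List[str]) -> str:
    """Assemble a prompt asking for a one-word eligibility decision.

    The wording is hypothetical; the abstract does not publish its prompts.
    """
    lines = ["You are screening articles for a systematic literature review.",
             "Inclusion criteria:"]
    lines += [f"- {c}" for c in inclusion]
    lines.append("Exclusion criteria:")
    lines += [f"- {c}" for c in exclusion]
    lines += ["Abstract:", abstract,
              "Answer with exactly one word: INCLUDE or EXCLUDE."]
    return "\n".join(lines)


def parse_decision(reply: str) -> bool:
    """Map the model reply to True (include) or False (exclude)."""
    word = reply.strip().upper()
    if word.startswith("INCLUDE"):
        return True
    if word.startswith("EXCLUDE"):
        return False
    # Anything else is treated as an error so it can be sent to a human reviewer.
    raise ValueError(f"Unrecognized screening reply: {reply!r}")


def screen_abstract(abstract: str,
                    inclusion: List[str],
                    exclusion: List[str],
                    ask_model: Callable[[str], str]) -> bool:
    """Screen one abstract; ask_model is an assumed wrapper around the LLM call."""
    prompt = build_screening_prompt(abstract, inclusion, exclusion)
    return parse_decision(ask_model(prompt))
```

Passing the model call in as a function keeps the sketch independent of any particular client library and makes the decision parsing easy to check with a stub in place of the real model.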

Results

Results are reported for the endometrial cancer use case. In abstract screening, the system achieved an accuracy of 86%, with precision, recall (sensitivity), and F1 score of 0.86, 0.94, and 0.90, respectively. In full-text screening, the system achieved an accuracy of 78.95%. For data element extraction, the F1 scores for identifying study details were 0.97 (strict matching) and 0.99 (relaxed matching), while the F1 scores for identifying clinical outcome values were 0.88 (strict matching) and 0.97 (relaxed matching).
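As a reading aid, the screening metrics above follow the standard definitions, and the reported F1 score can be checked from the reported precision and recall. The sketch below uses hypothetical function names; the abstract does not report the underlying confusion-matrix counts, so only the published precision and recall are used in the example call.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of articles the system included that experts also included."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of expert-included articles that the system also included."""
    return tp / (tp + fn)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all screening decisions that agree with the expert labels."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Consistency check using the reported abstract-screening figures.
print(f"{f1_score(0.86, 0.94):.2f}")  # prints 0.90
```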

Conclusion

The ISLaR platform successfully identified the majority (94%) of articles discovered in the manual search, significantly reducing the time required for the process. Additionally, it demonstrated satisfactory performance in data extraction. These findings indicate the potential of ISLaR for real-time updates and timely insights.

Presenter

Dong Wang, PhD
Merck
Associate Director of Real-world Data Analytics and Innovation