USE OF MACHINE LEARNING TO IMPROVE THE EFFICIENCY OF FINDING RELEVANT RESEARCH IN LARGE DATA SETS

L. Ng1,2, K. Chai3
1Curtin University, School of Allied Health, Perth, Australia, 2Curtin University, Curtin enAble Institute, Perth, Australia, 3Curtin University, School of Population Health, Perth, Australia

Background: The number of evidence-based articles published each year in scientific, peer-reviewed journals continues to grow. This has made synthesising evidence for systematic reviews and meta-analyses increasingly time consuming and costly. Modern tools applying machine learning methods have been developed to streamline the process of finding relevant research papers more efficiently, saving time and money.

Purpose: The aim of this study was to evaluate the potential workload saving of using Research Screener, a semi-automated machine learning tool that facilitates abstract screening, when applied to published systematic reviews that involved screening large numbers of abstracts.

Methods: A retrospective analysis was conducted on 3 published systematic reviews, each with over 10,000 abstracts. Research Screener is a tool that uses machine learning and natural language processing to analyse abstract text and rank abstracts by their relevance to the research question. To produce an initial ranking of unread abstracts, researchers provide pre-identified abstracts relevant to their review, termed seed abstracts. Researchers then read the ranked abstracts in sets of 50, with the algorithm re-ranking the unread abstracts after every set based on the abstracts included and excluded by the reviewer. We evaluated the number of abstracts that researchers did not have to read by calculating the Work Saved over Sampling metric at 100% recall (WSS@100), i.e. the proportion of abstracts that can be left unread while still finding all relevant abstracts for a systematic review. We also estimated the time and cost savings this system may offer for these studies by reducing abstract screening time.
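As an illustration (not part of the original abstract), WSS@100 can be computed directly from the total number of abstracts and the number actually read before all relevant records are found. A minimal Python sketch, using the review sizes and screening counts reported in the Results:

```python
def wss_at_100(total_abstracts: int, abstracts_read: int) -> float:
    """Work Saved over Sampling at 100% recall: the fraction of abstracts
    that can be left unread while still identifying every relevant record."""
    return (total_abstracts - abstracts_read) / total_abstracts


# The three reviews reported in the Results section.
for total, read in [(23_423, 1_172), (13_376, 1_739), (13_028, 881)]:
    print(f"N={total:>6}, read={read:>5}, WSS@100={wss_at_100(total, read):.0%}")
# N= 23423, read= 1172, WSS@100=95%
# N= 13376, read= 1739, WSS@100=87%
# N= 13028, read=  881, WSS@100=93%
```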

Results: Three studies with samples of 23,423, 13,376 and 13,028 abstracts were analysed. The WSS@100 was 95%, 87% and 93% respectively. This equates to researchers reading only 1,172, 1,739 and 881 abstracts respectively to find all relevant research articles answering the research question. Assuming an average reading speed of approximately 250 words per minute and an average peer-reviewed journal abstract length of approximately 250 words (i.e. roughly one abstract per minute), this equates to a time saving of 371 hours, 194 hours and 201 hours respectively for the 3 review papers, per researcher performing screening. At an average salary of $26 per hour for a post-doctoral researcher, this is an estimated saving of $9,646 USD, $5,044 USD and $5,250 USD for each researcher performing abstract screening.
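For readers who want to reproduce these estimates, the sketch below applies the abstract's stated assumptions (roughly one abstract read per minute and a post-doctoral salary of $26 USD per hour); the printed figures differ slightly from those reported above because of rounding.

```python
# Worked example under the abstract's stated assumptions; these are
# assumed figures, not measured values.
READ_MINUTES_PER_ABSTRACT = 1      # ~250-word abstract at ~250 words/minute
HOURLY_RATE_USD = 26               # assumed post-doctoral researcher salary

def screening_saving(total_abstracts: int, abstracts_read: int) -> tuple[float, float]:
    """Return (hours saved, USD saved) per reviewer from abstracts not read."""
    abstracts_skipped = total_abstracts - abstracts_read
    hours_saved = abstracts_skipped * READ_MINUTES_PER_ABSTRACT / 60
    return hours_saved, hours_saved * HOURLY_RATE_USD

for total, read in [(23_423, 1_172), (13_376, 1_739), (13_028, 881)]:
    hours, cost = screening_saving(total, read)
    print(f"~{hours:.0f} h saved, ~${cost:,.0f} USD per reviewer")
# ~371 h saved, ~$9,642 USD per reviewer   (reported: 371 h, $9,646)
# ~194 h saved, ~$5,043 USD per reviewer   (reported: 194 h, $5,044)
# ~202 h saved, ~$5,264 USD per reviewer   (reported: 201 h, $5,250)
```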

Conclusions: Research Screener reduced the time and cost of finding relevant research papers answering the research question by reducing the time spent reading irrelevant abstracts. The tool may offer even greater savings for reviews involving larger numbers of abstracts.

Implications: Tools such as Research Screener can allow researchers to conduct systematic and scoping reviews that were previously not feasible due to limited resources, and can allow reviews to be completed in a timelier manner. For example, they offer an approach for time-poor clinicians to engage in evidence-based research and practice.

Funding acknowledgements: Dr. Ng was funded to travel to the WCPT through a research award from Business Events Perth's 2021 Aspire Awards.

Keywords:
Machine learning
Abstract screening
Systematic reviews

Topics:
Innovative technology: information management, big data and artificial intelligence
Research methodology, knowledge translation & implementation science
Education: continuing professional development

Did this work require ethics approval? No
Reason: This is a secondary analysis of large data sets from published systematic reviews. No human or animal participants were involved in this study. This study describes an innovative way in which established methods have been adapted to meet the changing needs of practice.

