Optimizing ASReview Simulations

The ever-growing volume of scientific publications presents a significant challenge for researchers aiming to conduct systematic reviews. While machine learning (ML), and specifically active learning pipelines, have shown great promise in optimizing the screening process, running large-scale simulation studies to evaluate different ML models remains technically demanding.

Sergei Romanov’s MSc project addressed this challenge. He proposed a multiprocessing solution for ASReview Makita templates that divides large and complex sets of simulations into independent parts, thereby reducing computation time. His work applies parallel and distributed computing techniques in cloud environments, enabling researchers to run extensive simulation studies more efficiently and with less computational burden.

Progress

By leveraging parallel computing, containerization, and orchestration technologies, Sergei designed an architecture that allows simulation tasks to run concurrently on both local and virtual machines. Unlike traditional sequential workflows that execute simulation commands one after another, this multiprocessing approach reduces queuing time and scales with the availability of CPU resources.
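The contrast between sequential and concurrent execution can be illustrated with a minimal Python sketch. This is not Sergei's implementation, which relies on containerization and orchestration; it only shows the underlying idea of running independent commands side by side. The function names `run_command` and `run_all` are hypothetical, and the placeholder jobs stand in for the real simulation commands a Makita template would generate.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(command):
    """Run one independent command and return its exit code."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode


def run_all(commands, max_workers=4):
    """Run independent commands concurrently; results keep the input order."""
    # Threads suffice here: each thread only waits on a child process,
    # so the operating system runs the child processes in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    # Placeholder jobs (assumption): each one stands in for a single
    # simulation command taken from a generated jobs file.
    placeholder_jobs = [
        [sys.executable, "-c", f"print('simulation {i} done')"]
        for i in range(8)
    ]
    exit_codes = run_all(placeholder_jobs, max_workers=4)
    print(exit_codes)
```

In this sketch, `max_workers` would typically be set close to the number of available CPU cores, so that no command waits in a queue while a core sits idle.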

In his study, Sergei conducted extensive benchmarking of Makita’s ARFI template across varying CPU and memory configurations, demonstrating how multiprocessing can significantly accelerate simulation studies without compromising reproducibility. His work provides a technical guideline for data scientists on setting up scalable simulation workflows using ASReview.

Beyond improving computational efficiency, Sergei’s solution contributes to the development of more sustainable research practices. By reducing overall processing time, the approach aligns with emerging ‘green’ cloud computing paradigms that advocate for energy-efficient computational strategies.

Funding

This work was supported by the Netherlands eScience Center under grant number ODISSEI.2022.023.

People involved

More info