Cloud computing

Active learning (AL) can be a powerful tool to streamline systematic reviews by prioritizing relevant studies for screening. However, evaluating the performance of various AL models requires extensive simulation studies, which can be computationally intensive and time-consuming. To address this challenge, the ASReview team has developed a cloud-based infrastructure that enables efficient, scalable, and reproducible simulations.
Progress
The team developed an architecture with a multiprocessing computational strategy for ASReview Makita (Make It Automatic) templates, which help run simulation studies that mimic the screening process of active learning (AL) aided systematic reviews. The resulting paper provides a technical explanation of the proposed cloud architecture and its usage. In addition, we conducted 1140 simulations to measure computation time under various CPU and RAM settings.
Our analysis demonstrates that, although an individual simulation cannot be sped up, using multiprocessing to run several simulations at the same time substantially reduces the total run time. The time saved depends on the number of simulations and the size of the dataset, but if each simulation uses the same data, the speedup scales linearly with the number of cores used. For example, 10 simulations of one hour each would take 10 hours on a machine with 1 CPU core, but only 1 hour when using multiprocessing on a machine with 10 CPU cores.
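The arithmetic behind this example can be sketched in a few lines of Python. This is a minimal, idealized illustration and not the ASReview Makita implementation: `estimated_wall_clock`, `simulate`, and `run_parallel` are hypothetical names introduced here, the model ignores scheduling overhead and memory limits, and `simulate` is a placeholder for a real simulation run.

```python
import heapq
from concurrent.futures import ProcessPoolExecutor


def estimated_wall_clock(durations, n_cores):
    """Idealized wall-clock time when each job goes to the first core that frees up."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    # Each heap entry is the time at which one core becomes free.
    cores = [0.0] * n_cores
    for duration in durations:
        free_at = heapq.heappop(cores)
        heapq.heappush(cores, free_at + duration)
    return max(cores)


def simulate(seed):
    """Hypothetical placeholder for a single, independent simulation run."""
    return seed * seed


def run_parallel(seeds, n_workers):
    """Run independent simulations side by side, one process per worker."""
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(simulate, seeds))


if __name__ == "__main__":
    one_hour_jobs = [1.0] * 10  # ten simulations of one hour each
    print(estimated_wall_clock(one_hour_jobs, 1))   # 10.0
    print(estimated_wall_clock(one_hour_jobs, 10))  # 1.0
```

Each simulation still takes its full hour in this model; only the total shrinks, because independent runs no longer wait for one another. With fewer cores than simulations, the estimate falls in between (for instance, 3 hours on 4 cores for the same ten jobs).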
The results of our analysis can support future research on assisted systematic review pipelines in two ways: first, as software for running simulations efficiently; second, as a basis for subsequent studies of how multiprocessing affects simulation speedup. Overall, parallelization techniques have the potential to greatly improve the computational efficiency of simulating AL pipelines.
Funding
This project was supported by the Netherlands eScience Center under grant number .