Can we, and should we, predict study delay?

How can we responsibly deploy a predictive modelling dashboard for study advisors, so that they can help students at risk of dropping out or being delayed in their studies sooner? You might think that is an excellent question, especially now that the Dutch government wants to introduce a penalty for students who take longer than the nominal duration to finish their bachelor's or master's studies.

Recently, we published an article on this topic in the journal Computers and Education: Artificial Intelligence. In it, we focus on how the dashboard was developed and which steps and considerations played a role in the responsible deployment of a tool that uses artificial intelligence (AI) in an educational environment.

Introduction

One of the tasks of study advisors at Utrecht University (UU) is to assist students who are at risk of dropping out or facing study delay. Being delayed in your studies can have negative consequences, such as getting a financial penalty (if our government gets its way).

The study advisors at UU monitor a large number of students and experience a high workload. It has been argued that AI systems that predict study delay can help study advisors reduce their workload and make it easier to contact students at risk of delay much sooner.

Team Learning Analytics (LA) collaborated with two faculties in a pilot to examine whether we can predict study delay and thereby give study advisors meaningful insight into students' study progress. For the pilot we followed our LA roadmap to ensure responsible processing of student data. Part of that process is considering the views of all involved stakeholders and carrying out a privacy scan to determine the risk level of the project.

Pros and cons of prediction models

This project, which involves processing personal data, prompted a thorough exploration of stakeholder perspectives, including students, study advisors, and privacy officers. Team LA conducted interviews and held focus groups with these stakeholders. The concerns of the stakeholders could be categorized into three groups: data management, algorithms, and educational implications.

Stakeholders agreed on data risks and the necessity of GDPR compliance, emphasizing authorized access to the data and predictions, and students' rights to opt-out of the predictions.

Concerns about the algorithm revolved around potential biases and the need for human oversight (human in the loop). Hence, preparatory sessions were organized for study advisors to clarify how the dashboard should be used and how to interpret the predictions.

Pedagogically, perspectives varied: while study advisors viewed the dashboard as a supportive tool, students worried about the relationship between students and study advisors and about the risk of being labeled as "at risk." It was agreed that communication to the students should be careful and supportive, and that an evaluation plan should be in place to periodically monitor and adjust the project if needed.

Based on the privacy scan, it was decided to carry out a Data Protection Impact Assessment (DPIA). The privacy officers concluded that the potential benefits for students outweighed the privacy concerns if the mitigating measures were implemented. The project was approved and ready for implementation. 

Results of the pilot

The prediction model was turned into a dashboard for study advisors and implemented in practice for nine months. At the end of this period we evaluated the pilot with the study advisors. We learned valuable lessons from this project, particularly about the factors that make the implementation of LA projects a success.

In terms of dashboard design, study advisors found the dashboard initially overwhelming and suggested improvements, such as customizable design features.

In terms of the data that formed the input for the predictions, we made use of study results, but not, for example, of data at the course level. The study advisors indicated that basing the predictions on a wider dataset would improve their usefulness. However, even with this initial dataset, some study advisors experienced challenges with interpreting the predictions, especially when they contradicted existing student information (e.g. a student who was on track, but was still predicted to be at risk of delay based on a combination of other factors).
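To illustrate that last point, the sketch below shows how a simple logistic risk score that combines several study-result features can still flag a student whose credit progress looks fine. The feature names and weights are hypothetical and chosen purely for illustration; they are not the features or the model used in the pilot.

    # Minimal sketch: a logistic model that combines several aggregated
    # study-result features into a single delay-risk probability.
    # Feature names and weights are illustrative, not those of the pilot model.
    import math

    # Hypothetical weights, as if learned from historical study results.
    WEIGHTS = {
        "credits_behind_schedule": 0.9,   # ECTS short of the nominal path
        "resit_count": 0.4,               # number of resits taken so far
        "grade_trend": -1.2,              # an upward grade trend lowers the risk
    }
    INTERCEPT = -2.0

    def delay_risk(features):
        """Return the predicted probability (0..1) of study delay."""
        z = INTERCEPT + sum(WEIGHTS[name] * value for name, value in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    # A student with no missing credits can still be flagged when the other
    # features combine unfavourably, which is exactly the kind of prediction
    # the study advisors found hard to interpret.
    on_track_student = {"credits_behind_schedule": 0, "resit_count": 4, "grade_trend": -1.0}
    print(f"risk = {delay_risk(on_track_student):.2f}")  # about 0.69 with these toy weights

The point of the toy example is not the numbers, but that a prediction driven by a combination of factors is harder to explain than one driven by a single obvious signal, which is why interpretability support for advisors matters.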

Further, their beliefs about predictive modeling varied; some were skeptical, while others were open to being surprised by the model. For example, some study advisors indicated they gained new insights about students and their potential need for support as a result of the dashboard.

Lastly, institutional factors also played a role, such as a high perceived workload, which left insufficient time to properly work with and test the dashboard.

To conclude

The pilot led to valuable insights into the potential of data-informed empowerment of education. Based on the outcomes of the pilot, we think that investigating LA as part of the toolkit for student support is a valuable way forward. Identifying study delay and the factors that cause it could inform not only study advisors but also, for example, program directors, who could adjust the curriculum based on that type of information if needed.

At the institutional level, our recommendation would be to invest further in these types of innovation and in the required facilities, including data literacy for all staff and students. Investing in further improving the accuracy of prediction models will also be important for the future.

We hope this blog provides insight into how LA projects are initiated at our university and how we collaborate with a range of stakeholders to try to ensure responsible processing of student data.

Want to read more?

  • We have recently published a scientific article about this project.
  • For a summary of the project on the UU website, see the project page.