How Data Science Helps Predict Student Withdrawals

King’s College London

Predicting Student Withdrawals with Data Science

Case Studies | 29 July 2021

The Challenge

Student withdrawal is a huge challenge for the higher education sector

Today, on average 7 out of every 100 undergraduate students in the UK drop out of their higher education course before their second year. As more students embrace remote learning, it is becoming increasingly difficult for higher education institutions to identify students who are struggling. It is crucial to be able to identify these students and to make timely interventions to improve student retention, performance, and wellbeing.

Establishing a reliable solution to this problem that doesn’t infringe upon the privacy of individuals is challenging, as is predicting the complexities of when and why students will decide to leave their courses. The decision is often influenced by several contributing factors, including academic (progress, feedback), social (peer group, engagement) and external (family, health, financial).

For King’s College London there are significant student, reputational and commercial benefits in being able to establish an early warning system that would enable staff to identify students who require additional support. King’s and Telefónica Tech have a successful and long-standing relationship developing Data Platforms. Based on our partnership, we agreed to work together to see if training a machine learning model using data within the platform could provide an accurate and effective predictive solution.

The Approach

Proof of Concept for King's College London

An effective proof of concept (POC) exercise is a necessary first step to prove or disprove whether a machine learning solution is viable.

Before committing to a full investment in a machine learning solution, it is advantageous to establish quickly, and at minimal cost, whether the required predictions can be generated accurately from the available data. It often turns out that the data isn’t accessible, that there isn’t enough of it, or that even where the data does support the required predictions, they can’t be generated in actionable timescales.

The project team had to be focussed, efficient in its use of time and effective in delivering reliable results. Telefónica Tech’s experience and proven machine learning development approach offered the highest likelihood of success.

What ethical considerations need to be addressed before embarking on such a project? Where a model’s focus relates to individuals, questions of privacy and consent must be considered. Do subjects consent to the use of their data for the intended purpose?

Also, do any of the key features of the model involve sensitive data about individuals? Student activity and demographics provide key data points for a predictive model of this type. Striking the balance of using these ethically, in a way that complies with privacy regulations such as GDPR and respects the consent of subjects, would be a critical success factor.

Director of Analytics, King's College London

Richard Salter

"King’s wants to support every student to reach their potential and achieve their ambitions. Where students are disengaging or experiencing difficulties, speed is of the essence. If we can identify the issue early and ideally predict it before it happens, then we are much better able to support the student. If we are slow to identify and respond, then the chances of redressing the situation are drastically reduced."

Project Approach

Telefónica Tech applied our proven AI POC approach to collect, analyse, and prepare data and then to train and evaluate candidate models quickly and rigorously. The required data was identified, sampled, cleaned, and analysed to identify any anomalies or gaps. Once the data was understood, work began to identify and trial candidate features which might help to accurately predict a withdrawal outcome. Different combinations of feature sets and algorithms were explored, analysed, and evaluated. We worked very closely with our King’s counterparts to ensure that ethical considerations were properly addressed during the scoping and planning of the work, and measures were agreed and implemented to ensure the privacy and security of data processing throughout.
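As an illustration of the "anomalies or gaps" step, the sketch below reports the share of sampled records in which each field is missing. It is a minimal example under stated assumptions, not the project's actual tooling: the function name profile_gaps, the field names progress_score and feedback_count, and the sample rows are all hypothetical.

```python
from collections import Counter


def profile_gaps(records, fields):
    """Return, per field, the fraction of records where it is absent, None or empty."""
    missing = Counter()
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is None or value == "":
                missing[field] += 1
    total = len(records)
    return {field: (missing[field] / total if total else 0.0) for field in fields}


# Hypothetical sample rows; real data would come from the data platform.
sample = [
    {"student_id": "s1", "progress_score": 0.9, "feedback_count": 4},
    {"student_id": "s2", "progress_score": None, "feedback_count": 2},
    {"student_id": "s3", "progress_score": 0.4, "feedback_count": ""},
    {"student_id": "s4", "progress_score": None},
]

gap_report = profile_gaps(sample, ["progress_score", "feedback_count"])
print(gap_report)
```

A report like this helps decide early whether a candidate feature has enough coverage to be worth trialling.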

Student Withdrawal Project Approach

This is our standard approach for a data science project, covering all the key components needed for a successful implementation.

The Telefónica Tech AI team conducted two iterative model build phases. The first proved that machine learning could accurately predict student withdrawal, but the predictions it delivered weren’t timely: they couldn’t highlight the risk of a student withdrawing with sufficient lead time to allow King’s to make a meaningful intervention. Telefónica Tech and King’s reviewed the situation and quickly agreed the scope of a second, short POC phase.

For the second iteration, an additional data source was made available that offered insight into a student’s engagement with King’s online learning system. Data points from this system included the number of logins, interactions with forums and groups, and assessment submissions. These additional activity-related data points provided far greater insight into student engagement throughout the academic year. The model subsequently trained delivered withdrawal indicators that were both accurate and more timely.
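To make the engagement data points more concrete, the sketch below turns a log of learning-system events into per-student counts of logins, forum interactions and assessment submissions. It is only an illustration: the event layout (student id, kind, date), the function name engagement_features and the as_of cutoff are assumptions, not details of the King’s pipeline. The cutoff is included because counting only the activity seen up to a given date is one plausible way to score students part-way through the academic year.

```python
from collections import defaultdict
from datetime import date

# Event kinds mirror the data points named in the text; the layout is assumed.
EVENT_KINDS = ("login", "forum_interaction", "assessment_submission")


def engagement_features(events, as_of):
    """Count each event kind per student, using only events on or before `as_of`."""
    features = defaultdict(lambda: {kind: 0 for kind in EVENT_KINDS})
    for student_id, kind, event_date in events:
        if kind in EVENT_KINDS and event_date <= as_of:
            features[student_id][kind] += 1
    return dict(features)


# Hypothetical events for two students.
events = [
    ("s1", "login", date(2020, 10, 1)),
    ("s1", "login", date(2020, 10, 8)),
    ("s1", "assessment_submission", date(2020, 10, 9)),
    ("s2", "login", date(2020, 10, 2)),
    ("s2", "forum_interaction", date(2020, 11, 20)),  # after the cutoff, ignored
]

features = engagement_features(events, as_of=date(2020, 10, 31))
print(features)
```

Students with no events before the cutoff do not appear in the result at all, so a real pipeline would need to join against the enrolment list to represent them with zero counts.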

The Solutions

AI Pilot Delivery

Ready to move fast and prove value? Telefónica Tech’s AI Pilot Delivery enables rapid deployment of AI solutions with enterprise-grade resilience. From feasibility assessment to agile delivery and adoption planning, we help you build pilots that are not only innovative but also scalable and sustainable.

AI Solution Envisioning

With the explosion of Generative AI, identifying the right use cases is more critical than ever. Telefónica Tech’s AI Solution Envisioning helps you uncover, assess, and prioritise opportunities that align with your business goals. Our structured workshops and expert guidance ensure your AI journey starts with clarity, feasibility, and measurable value.

Databricks

As a certified Databricks Partner, Telefónica Tech delivers data, analytics and AI solutions built on the Databricks Lakehouse Platform. We support organisations with the design, deployment and optimisation of Databricks to meet business and technical requirements. Our teams combine data engineering, data science and cloud expertise to deliver scalable and secure Databricks environments.

The Outcomes

Results through precision data

Telefónica Tech proved that probability indicators of student withdrawal could be delivered to key King’s staff members in a timely manner, and with a degree of accuracy that gives them the confidence to act upon the insights. The most recently trained classification model reached an overall accuracy of 92 percent. Accuracy alone is rarely an indicator of success, however, so we optimised the models for Precision, which aims to reduce the number of false positives. This allows the King’s team to focus on a condensed cohort of students, with the final model producing a Precision score of 98 percent.
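The difference between the two metrics can be shown with a small worked example. The sketch below computes accuracy and Precision from predicted withdrawal probabilities at two decision thresholds. The labels, probabilities and thresholds are toy values chosen for illustration, not project data, and tuning a threshold is only one assumed way of favouring Precision; the source does not describe how the models were optimised.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, tn, fn) for binary labels where 1 means 'withdrew'."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1
        elif p and not a:
            fp += 1
        elif a:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def evaluate(actual, probabilities, threshold):
    """Flag students whose probability meets the threshold; return (accuracy, precision)."""
    predicted = [1 if prob >= threshold else 0 for prob in probabilities]
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    accuracy = (tp + tn) / len(actual)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision


# Toy data: 3 of 10 students withdrew.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
probabilities = [0.95, 0.80, 0.55, 0.60, 0.30, 0.20, 0.10, 0.10, 0.05, 0.05]

lenient = evaluate(actual, probabilities, threshold=0.5)
strict = evaluate(actual, probabilities, threshold=0.7)
print(lenient, strict)
```

In this toy case both thresholds give the same accuracy, but the stricter one flags no student who did not withdraw, at the cost of missing one who did. That trade-off is why accuracy alone says little about how useful a flagged cohort will be.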

The solution incorporated a trained model as well as the data pipelines needed to generate predictions on an ongoing basis. Furthermore, a way of working on future machine learning problems was proven. This was all achieved in a few weeks and for minimal investment.

Director of Analytics, King's College London

Richard Salter

"The results of the different models from the proof of concept were well beyond our expectations. They unambiguously affirmed the potential of this solution and very quickly we moved into thinking about how we could bring it into full production and leverage the value from the insights the model provided."