How big data is delivering big outcomes in the education sector

The Department for Education (DfE) has built an interactive dashboard to help schools, governing bodies and local authorities improve levels of pupil attendance. 

Attendance data is being collected from schools nationwide on a voluntary basis, enabling each school to gain better insight from its own data and local authorities to view the information of participating schools in their area. By providing an up-to-date record of pupil-level attendance and aggregating patterns of absence, the project allows national, regional and local trends within the education sector to be identified and used to shape future policies.

Spearheading the data elements of the school attendance project is Neil McIvor, Chief Data Officer at the Department for Education, who spoke to Government Transformation Magazine about how the programme got started and the value of big data in driving better policy-making. 

Covid as a catalyst 

The school attendance project started as a temporary way of assessing the scale of absenteeism and school closures at the start of Covid-19 by getting schools to fill out an online form. The information collected ended up having a direct impact on government decisions around the spread of the virus and the modelling of lockdown policies.

Since then, the project has scaled rapidly. McIvor and his team worked out that it would be more beneficial to pull attendance records directly from schools' computer systems, rather than relying on an individual to fill in a form. To do this, they partnered with an aggregator company that already collected information from a variety of management information systems, and built automated API pipelines from those systems.

The final product is a near real-time interactive dashboard that has been opened up to participating schools through a secure authentication layer.

The bigger picture 

McIvor and his team are currently pulling morning and afternoon school registers, representing around 14 million records a day. The database holds 5 billion records and grows by a quarter of a billion records every month.
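
To give a sense of how the quoted figures relate to one another, here is a rough back-of-the-envelope check in Python. It is only a sketch: the three headline numbers come from the article, while the assumption of one record per pupil per register session (two sessions a day) is ours and is not confirmed by the DfE.

```python
# Figures quoted in the article.
records_per_day = 14_000_000      # morning plus afternoon registers
monthly_growth = 250_000_000      # "a quarter of a billion records every month"
database_size = 5_000_000_000     # current size of the database

# Assumption (not from the article): one record per pupil per register
# session, with two sessions (morning and afternoon) each school day.
sessions_per_day = 2
records_per_session = records_per_day / sessions_per_day

# Number of school days per month implied by the quoted growth rate.
implied_school_days_per_month = monthly_growth / records_per_day

# Months of collection at the current rate that would add up to the database size.
months_at_current_rate = database_size / monthly_growth

print(f"{records_per_session:,.0f} records per register session")
print(f"{implied_school_days_per_month:.1f} school days per month implied")
print(f"{months_at_current_rate:.0f} months of growth at the current rate")
```

On these numbers, the monthly growth figure corresponds to roughly 18 school days of registers a month, and the database size is equivalent to 20 months of collection at the current rate.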

“This is big data in a way that government generally doesn't use big data,” McIvor says. “When we first started pulling data, we realised that we’d never actually seen the big picture before, in terms of school attendance patterns… Suddenly we have all this information that can be used.”

There are a multitude of ways this data can make a difference: it can enable schools and local authorities to look at individual students or year groups over time, and can provide early warning signs for students whose absences are getting worse. This level of insight has the potential to shape school policies going forward, and has already led to conversations at local level around safeguarding and child welfare, McIvor says.
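
As an illustration of what an early warning check on register data might look like, the Python sketch below flags pupils whose absence rate in the most recent weeks is noticeably higher than in earlier weeks. It is not the DfE's method: the record layout (pupil, week, present), the function names, the two-week window and the 10-percentage-point threshold are all hypothetical choices made for this example.

```python
from collections import defaultdict


def weekly_absence_rates(marks):
    """Turn (pupil_id, week, present) register marks into
    {pupil_id: {week: absence_rate}}. The record layout is hypothetical."""
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [absent, sessions]
    for pupil_id, week, present in marks:
        counts = totals[pupil_id][week]
        counts[1] += 1
        if not present:
            counts[0] += 1
    return {
        pupil: {week: absent / sessions for week, (absent, sessions) in weeks.items()}
        for pupil, weeks in totals.items()
    }


def flag_worsening(marks, recent_weeks=2, min_increase=0.10):
    """Return pupils whose mean absence rate over the last `recent_weeks`
    weeks exceeds their mean over all earlier weeks by at least
    `min_increase`. Both parameters are illustrative defaults."""
    flagged = []
    for pupil_id, rates in weekly_absence_rates(marks).items():
        weeks = sorted(rates)
        if len(weeks) <= recent_weeks:
            continue  # not enough history to compare against
        earlier = [rates[w] for w in weeks[:-recent_weeks]]
        recent = [rates[w] for w in weeks[-recent_weeks:]]
        increase = sum(recent) / len(recent) - sum(earlier) / len(earlier)
        if increase >= min_increase:
            flagged.append(pupil_id)
    return sorted(flagged)
```

A pupil with no absences for two weeks followed by three missed sessions out of ten in each of the next two weeks would be flagged, while a pupil who steadily misses one session a week would not. A real system would need to weigh authorised against unauthorised absence and many other factors.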

So far, 78% of all state-maintained schools in the UK have volunteered to be part of the pilot, a strong indicator that the education sector sees tremendous value in having access to that level of insight.

Building a successful and credible system 

The school attendance project started in January 2022 and went live in September 2022. McIvor attributes its speed and success to having spent time building out the internal capability. “We didn't bring in any external organisations to build this, which means that we have total control of it.”

A major concern was making sure the system maintained its credibility. The last thing McIvor wanted was for schools to log onto their dashboard "and see the blue circle of death," because then they would see a system that doesn't work and lose heart, and "as soon as you lose the hearts and minds, the whole project loses credibility."

This meant ensuring the performance of the system was up to scratch, McIvor explains. “My biggest focus is the stability and the performance of the public or the school facing dashboards, the needs of the department has got to come secondary to that. If our analysts have to take a day to get into it, I'd rather that happen than a school not being able to access the systems.”

Navigating a complex data environment

Dealing with a disparate ecosystem of data within the UK education sector, made up of around 24,500 schools in England - all of which are individual legal entities and host their own data on different systems - was one of the biggest challenges to overcome. 

“Having to ingest from multiple information systems that I have absolutely no control over how they build and maintain it, and thinking about how we could hook our APIs onto the different systems was one of the biggest challenges. This is why we went with the aggregator company,” McIvor says.

He adds: “We also did some big deep dive architectural reviews and made sure to have some of those quite hard internal discussions about prioritisation as we went.”

Overcoming the challenges surrounding data protection and security meant ensuring that all of the right legal gateways and data privacy impacts were thought through at a very early phase. The project was co-produced with DfE’s attendance policy team, overseen by Sophie Taylor, Director of Vulnerable Children Strategy and Educational Engagement, to ensure that the right policy skills were in play. These have included understanding the role of data in attendance policy, aligning the project with Ministerial priorities, programme management, and stakeholder management.

“We've been in constant dialogue with the unions, teachers, directors of children's care, and also with interested parties in other parts of government, all of whom are interested in seeing this project is carried out correctly,” McIvor says.

A lesson in cross-sectoral data sharing

The school attendance project offers an important lesson in cross-sectoral data sharing.

“You can't run digital transformations without data transformations and you can't evaluate government policies without being able to have interoperable data across systems; this is important in understanding not just what's happening, but also what's working,” McIvor explains. “Very few policies nowadays sit within one government department so we need to be able to share data across government departments and manage it.”

McIvor has the “unique advantage” of also leading the data protection function within the DfE, which he admits is “quite unusual”. Typically, this function sits alongside the legal or governance team, which leads data to be viewed in a strictly protective context. Seeing data as an enabler, rather than as a protective measure or a blocker, requires both a cultural and organisational shift within government departments.

“By embedding the data protection team alongside the engineers who are building the solution, it helps unlock the benefits to the pupils by considering their needs at every stage, alongside the needs of the organisations who are using data.”

Strike action and next steps 

On 1st February, the flexibility built into the design enabled McIvor to pull attendance registers automatically every hour throughout the day - some 76.5 million records - to build a bigger picture of the impact of the teachers’ strike.

The official statistics detailing the number of schools open, partially open or fully closed as a result of the industrial action were published that same afternoon and have been used throughout the national media. McIvor says having this information in the public domain allows people to debate issues without arguing over the facts.

McIvor and his team are now focused on scaling the project. Next steps involve adding functionality to the dashboard so that schools can benchmark attendance against other schools nationwide. This phase of the project is expected to be launched in February.

“It's all very well seeing how individual schools are doing, but they are going to want to know how they compare to others...the possibilities are endless.”
