March 2021

Promoting a more data-driven approach to flood risk management in the age of big data

In a recently published paper [1], the Ensemble team, along with JBA Trust and the Environment Agency, discuss how to enable more data-driven decision support in flood risk management in an age of big data. This paper represents a major milestone for the project, reflecting the understanding gained from our very first sprint on this topic.

As we have seen recently, flood risk management is a huge challenge for society, with an increasing number of extreme weather events leading to flooding. Flood risk management has always relied on data, but recently there has been an explosion in the availability of data (cf. the age of big data). This data comes from a wide variety of sources: ground-based surveys, aircraft and satellites, an increasing number of in-situ sensors, citizen science, and data scraped from social media and the web more generally. It is coupled with an equivalent explosion in data outputs from model runs. It is no surprise that all parties involved in flood risk management risk being overwhelmed by this data (we should resist the temptation to say ‘drowning’ in it). There is potential in this data, but equally there is a need for tool support to manage it and to support decision-making along the end-to-end pathway from data acquisition to decision. These problems are exacerbated by the fact that the data is highly complex, being heterogeneous in nature and existing at a wide variety of scales.

The overall goal of our first sprint was to look at the role of cloud computing, and contemporary innovations in this area, in supporting a more data-driven approach to flood risk management. The work was inspired by the UK’s National Flood Resilience Review (NFRR), which highlights the need for a dual approach to flood risk management: a statistical pathway, which makes use of this increasing availability of data, integrated with a more traditional process model approach (their so-called dynamical pathway). The work is also strongly aligned with the goals of the National Flood Risk Assessment 2 project (NaFRA2).

We took a novel approach in this first sprint, employing an agile research methodology where we iterated towards our eventual solution by early and continuous delivery of software prototypes together with continuous feedback from our partners in the project. This approach allowed us to fold in partners as an intrinsic part of the research process and also strongly supported cross-disciplinary dialogue. The approach also allowed us to explore what is both technically feasible and desirable in the application of cloud computing technologies to this area.

The research identified the concept of a data hypercube as a mechanism for integrating the range of heterogeneous data sources (see the diagram below for an illustration of a portion of such a hypercube). We also demonstrated how this concept could be implemented through a suggested layered software architecture that makes use of underlying cloud storage and computation services, semantic web and query technologies, and notebook technologies to support open, transparent, collaborative and reproducible studies in support of decision-making. We also investigated how AI techniques could be used to extract useful information from unstructured sources, such as Local Authority flood event reports, which are often only captured digitally as PDF reports. Through this we achieved a useful integration of both structured and unstructured data types. The approach was demonstrated through a case study on how different communities are at risk from flooding events.

[Figure: A section of the hypercube diagram]
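
To make the hypercube idea a little more concrete, the following is a minimal Python sketch. It is not the architecture from the paper. It assumes a sparse cube with three hypothetical dimensions (location, date and source), and every name and value in it (`Hypercube`, `town_a`, `river_gauge`, the readings and the report text) is invented for illustration. It shows how a structured sensor reading and text extracted from a report could sit side by side in the same cube and be retrieved with a single slice.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

# Hypothetical dimensions chosen for this sketch; the paper's hypercube
# may use different or additional dimensions.
DIMENSIONS = ("location", "date", "source")
Coords = Tuple[str, str, str]


class Hypercube:
    """Sparse cube: each cell holds the values recorded at one coordinate."""

    def __init__(self) -> None:
        self._cells: Dict[Coords, List[Any]] = defaultdict(list)

    def add(self, location: str, date: str, source: str, value: Any) -> None:
        """Store a value (structured or unstructured) in its cell."""
        self._cells[(location, date, source)].append(value)

    def slice(self, **fixed: str) -> Dict[Coords, List[Any]]:
        """Return the cells whose coordinates match every fixed dimension."""
        unknown = set(fixed) - set(DIMENSIONS)
        if unknown:
            raise KeyError(f"unknown dimensions: {sorted(unknown)}")
        return {
            coords: list(values)
            for coords, values in self._cells.items()
            if all(coords[DIMENSIONS.index(dim)] == want
                   for dim, want in fixed.items())
        }


# All data below is made up purely to exercise the sketch.
cube = Hypercube()
cube.add("town_a", "2021-01-01", "river_gauge", {"level_m": 3.2})
cube.add("town_a", "2021-01-01", "event_report", "Properties flooded on the high street.")
cube.add("town_b", "2021-01-01", "river_gauge", {"level_m": 1.1})

# Everything known about one place, regardless of source.
town_a = cube.slice(location="town_a")

# One source across all places on a given date.
gauges = cube.slice(source="river_gauge", date="2021-01-01")
```

Slicing on `location` returns both the gauge reading and the report text for that place, which is the kind of structured and unstructured integration described above. A real system would replace the in-memory dictionary with cloud storage and a query layer.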


We have been strongly encouraged by the response to our studies. Starting with our partners, Sue Manson from the Environment Agency, reflecting on the project, stated:

“In a data driven world where more and more data is becoming available, this data driven approach is going to be key to us in the future. We’ve been taking forward some of that thinking in terms of feeding into a new product for national flood risk assessment and we’re also looking at this for some of our research projects in terms of crowd sourcing and other avenues of data that aren’t normally available to us.”


“The work from the Ensemble project has influenced our future thinking as we frequently haven’t thought about data and computer specialists from the outset of a project, so we bring them in further down the line, whereas the Ensemble project has shown the value of having that collective team from the outset, working together to find solutions that you didn’t even think were possible at the outset.”

Prof. Rob Lamb from JBA Trust also noted:

“We started working with the Ensemble team by asking “how might contemporary ideas in data science and digital technology change the way we think about flood risk models?”. The work with Ensemble helped us to bring together six professional partners and critically re-assess flood risk modelling from the viewpoints of multiple stakeholders. It points the way to a more realistic approach to risk analysis that will allow different people to find diverse perspectives on risk whilst drawing on shared underlying evidence.”

We are delighted to say that this work is far from finished. We are currently working with JBA Consulting in a follow-up Knowledge Transfer Partnership to develop further the ideas that emerged from this sprint. We are also feeding our thoughts into NaFRA2, the contract for which has been awarded to a consortium involving JBA Consulting (along with Jacobs). We are also delighted to say that a second paper has recently been published, focussing more on the semantic web and natural language processing aspects of the work [2]. There is a long way to go, but we look forward to the day when we can tame this deluge of big data and use the resulting insight and knowledge to support more informed decision-making in flood risk management.

Prof. Gordon Blair

EPSRC Senior Fellow in Digital Technology and Living with Environmental Change

28th January 2021
Updated 21st April 2021


[1] Towe, R., Dean, G., Edwards, L., Nundloll, V., Blair, G.S., Lamb, R., Hankin, B., Manson, S. (2020). Rethinking data‐driven decision support in flood risk management for a big data age. Journal of Flood Risk Management, 13(4).

[2] Nundloll, V., Lamb, R., Hankin, B., Blair, G.S. (2021). A semantic approach to enable data integration for the domain of flood risk management. Environmental Challenges.

Header Image: Bingley Boxing Day Floods 2015, by Chris Gallagher on Unsplash