August 2018

On Complexity in Distributed Systems

In early July, I was in Vienna, attending and presenting at the 38th IEEE International Conference on Distributed Computing Systems (ICDCS). In some ways, this felt like going home, as distributed systems has been my core disciplinary area for over thirty years. In other ways, I felt like an outsider, as I have been working in more applied areas over the last five years, especially on how distributed systems technologies such as cloud computing and the Internet of Things can help Environmental Scientists understand and respond to issues such as climate change. This intellectual distance, though, was really useful in assessing the health of research in distributed systems and where the area is going as a subject.

My main reason for being in Vienna was to present a paper in the Vision/Blue Sky Thinking Track (a terrific initiative by ICDCS to encourage more disruptive, long-term thinking about the area). My paper, “Complex Distributed Systems: The Need for Fresh Perspectives” [1], draws on my experience, in the fellowship and beyond, of working with Environmental Scientists. The real focus of the paper is on complexity: the complexity of contemporary environmental science, as researchers tackle big scientific questions in the context of a rapidly changing environment; the complexity of the underlying distributed systems required to support this science; and, in particular, how to carry out complex science on top of a complex computational infrastructure. The paper comments that while the field of distributed systems has been successful in dealing with some aspects of complexity, most notably the very large-scale systems that are emerging, it has failed to address other key aspects, notably those arising from the extreme heterogeneity that exists in such systems.

Imagine a technological infrastructure to support environmental monitoring and management. Such a system will inevitably draw on a range of data sources, from high-volume satellite imagery through to the increasingly available data from sensors in the natural environment (effectively an environmental Internet of Things). This data will typically need to flow into the cloud, where it can be combined with historical data for subsequent analysis; the cloud, however, is not a uniform place but rather offers a wide range of APIs and capabilities. The data will also need to be interrogated from a range of devices, including mobile devices, for example to facilitate emergency response to natural disasters such as flooding. Very quickly, we move towards a highly complex distributed infrastructure, raising the questions of how we programme such complex systems and how we provide a level of abstraction so that end users (scientists, policy makers, etc.) can use such systems without being swamped by the underlying complexity of the technological infrastructure. Surely we want scientists to do science rather than struggle with Linux scripts and distributed systems algorithms.

In undergraduate courses on distributed systems, we are taught about middleware and how it provides programming models that abstract away from the complexity of distributed systems. But what does a middleware look like given the levels of complexity described above? To quote from the paper:

“It is clear that the solution is not just about making ‘better’ middleware. Rather, the author argues that there is a need to fundamentally rethink the role of middleware in what we refer to as complex distributed systems […] This paper is intended to provoke the distributed systems community to stand back from the body of work amassed over the last 30 years or so, and to go back to basics, to rethink the very foundations of distributed systems and middleware.”

The paper then sets out a fresh perspective on the role, purpose and form of middleware for the future, highlighting the need to: embrace systems-of-systems thinking in middleware, where we reason about domains and their boundaries; significantly raise the level of abstraction of platforms, so that we move from systems-oriented abstractions to application-oriented abstractions; and consider alternative structures for middleware that are more fluid (cf. emergent middleware). Please refer to the paper for more details.

As a final reflection, I heard some great papers at ICDCS, but this feels like a community that is working on the detail (which is very important) and perhaps also needs to think about the big picture. In other words, we need both the engineering and the architecture, so that we can address questions such as what a middleware looks like for highly complex distributed systems, and how we make the resultant middleware efficient. This feels very much like the transition I have experienced in the environmental sciences over the last decade or so. At one time, such scientists would be working on really specific issues, for example the chemical composition of soils; now they are being asked to answer questions about soil management in the context of complex ecosystems. Perhaps the distributed systems community needs to go through a similar metamorphosis, so that we produce technologies that are fit for purpose for some of the grand challenges of our time.

Gordon Blair

Distinguished Professor of Distributed Systems, and
EPSRC Senior Fellow in Digital Technology and Living with Environmental Change

[1] Blair, Gordon S. (2018) Complex Distributed Systems: The Need for Fresh Perspectives. In: Proceedings of the 38th IEEE International Conference on Distributed Computing Systems. IEEE. (In press; preprint available.)