IEEE DS Online Exclusive Content
From the Editors: Middleware Community

Sensor Networks + Grid Computing = A New Challenge for the Grid?

We’ve seen great strides over the past five years in two key areas of distributed systems technology. First, sensor networks have emerged as an approach to instrumenting the physical environment. Second, Grid computing has developed as a way to support large-scale distributed computation using shared heterogeneous pooled resources across administrative domains. Consequently, the opportunity naturally arises for a confluence between these two increasingly mature areas. A new class of distributed applications could even emerge; in fact, there’s clear evidence that this is happening.

These applications, which we call pervasive management support (PMS) systems, combine sensor network and Grid technologies to support the ongoing management of evolving physical systems. For example, PMS systems have recently been developed to help predict and manage environmental threats such as fire or flooding [1,2], to aid emergency rescue services [3], and to monitor and react to strains in engineering artifacts such as bridges [4] and aircraft engines [5]. Essentially, PMS systems marry reality and simulation and support sophisticated what-if analysis to help people make faster and better real-world decisions. They have two key defining characteristics:
In addition, some PMS systems incorporate group work support so that geographically distributed experts can collaboratively interact with the simulations and visualizations, discuss the ongoing situation’s implications, and coordinate its management. PMS systems might also incorporate decision support software that semiautomates the associated decision-making process. Again, such functions must occur in real time. Furthermore, because many potential application areas are safety critical and might carry significant financial penalties in case of failure, both the basic simulation and visualization functions and any auxiliary functionality (such as group working and decision support) must meet appropriate criticality constraints. The system must tolerate both hardware and software failures and not fail at a critical moment owing to overload.

A case study

The Cowbridge project at Lancaster University employs the PMS approach to predict and manage flooding in a river valley in the north of England [2]. The system incorporates an adaptive, resilient sensor network that feeds real-time sensor data (for example, on water depth, spread, and flow rates) to a computationally intensive flood-prediction algorithm running on a general-purpose computational cluster in a customized Grid middleware environment. Essentially, the system enables real-time prediction of flooding, including detailed predictions of which areas flooding will most likely affect. It also provides timely alerts to local stakeholders (such as residents and motorists) when it perceives flooding to be imminent.

An interesting technical aspect of this system is that it supports selectively and dynamically assigning computation tasks to either the sensor network itself or the remote cluster. For example, the optimal processing site for the system’s flow-rate modeling subsystem depends on the speed and current loading of the General Packet Radio Service (GPRS) link between the sensor network and the cluster.
This flow-rate modeling subsystem deduces river flow rates on the basis of the speed and trajectory of pieces of flotsam that are tracked by successive video cameras mounted along the river. If the GPRS link is heavily loaded (for example, because other parts of the system are using it), it might be infeasible to use it to transmit raw video image bitmaps and manipulate them on the cluster side. In this case, therefore, it’s best to do the processing locally using spare computational capacity on sensor network nodes. On the other hand, if the link is relatively unloaded and/or the sensor batteries are low, it could be better to perform the processing on the cluster and accept the overhead of shipping the images across the link.

Why the Grid?

So how and why is the Grid involved in such systems? An alternative means of supporting the heavy-duty computational requirements of PMS systems would be to employ a dedicated computational facility, an approach that many embryonic PMS projects take. Although this approach has its benefits, leveraging the Grid offers a potentially far more attractive solution. First, by employing pooled resources, a Grid solution can be cheaper. Second, it can offer a more flexible solution that can dynamically scale up to the computational power that the ongoing PMS application needs at any given moment. And finally, and equally important, it can potentially call on sufficient redundancy to underpin PMS applications’ current criticality requirements.

Technological challenges

Unfortunately, current resource management middleware for the Grid isn’t up to the task, primarily because its resource-management strategies are inappropriately targeted and insufficiently agile. First, current Grid middleware schedules jobs to optimize average throughput rather than per-job completion time.
But this doesn’t accommodate PMS simulations’ specific timeliness needs; such simulations obviously need guarantees that they can keep up with the rate of incoming real-time data from the sensor network, and such guarantees require a priority- or deadline-driven scheduler. Second, PMS systems require an entirely new pattern of Grid resource demand, namely the ability to spawn, with millisecond latency, extended or new simulations in response to unexpected real-world events from the sensor network. Third, an “agile” resource manager must take into account locality and connectivity information across the whole system (incorporating both the sensor network and the Grid) when allocating computational resources. This is clearly required in cases such as the video-image-bitmap processing we discussed for modeling river flow rates.

A second key area of weakness in current technologies lies at the interface between the sensor network and Grid domains. Traditionally, sensor network designers assume a single “sink node” to which data is directed for subsequent storage, processing, and analysis. However, given PMS systems’ timeliness and criticality requirements, this is clearly insufficient. First, data transfer from the sensor network to the Grid must occur in real time. Second, the middleware must adequately support the criticality needs of PMS applications, perhaps in terms of providing redundant or backup gateways or links between the two domains. Third, merely providing physical connectivity isn’t enough; the middleware must also provide communications abstractions that suitably reflect the application-specific manner in which data is transferred across the link. For example, a media-streaming abstraction might be appropriate in some contexts, whereas a publish-subscribe abstraction might be better in others.
And finally, the whole of the integrated infrastructure must be adaptive and self-managing to accommodate inevitable partial failures, transient overloads, and changing timeliness and criticality demands (for example, as the system moves from a quiescent state to a potential emergency condition).

The required research agenda to adequately support Grid-based PMS systems is a demanding one. Essentially, we need a new mind-set for Grid computing that moves us away from thinking of grids as large-scale “batch processors” and toward thinking of them as flexible, agile, autonomic computational facilities that are well integrated with other networking domains (not only sensor networks but also potentially ad hoc wireless networks). Unfortunately, although PMS applications are becoming more prominent, we think that research activity in such areas is insufficient. Instead, many in the Grid research community assume that basic resource management problems are already solved. Accordingly, researchers have moved on to higher-level concerns such as how best to export Grid computational capacity as Web services or how to support secure virtual organizations. Without at all denying such concerns’ importance, we believe that revisiting the Grid’s lower “systems” layers is equally essential if it’s to play a part in the broader application space that PMS researchers envision.

References
Geoff Coulson is a professor at Lancaster University. Contact him at geoff@comp.lancs.ac.uk. Dean Kuo is a Grid architect at the University of Manchester. Contact him at dean.kuo@manchester.ac.uk. John Brooke is a professor at the University of Manchester. Contact him at john.brooke@manchester.ac.uk.


