
    Motivations and challenges for stream processing in edge computing

    The 2030 Agenda for Sustainable Development of the United Nations General Assembly defines 17 development goals to be met for a sustainable future. Goals such as Industry, Innovation and Infrastructure and Sustainable Cities and Communities depend on digital systems. Billions of euros are being invested in digital transformation within the European Union, and many researchers are actively working to push the state of the art in techniques and tools that extract value and insights from the large amounts of raw data sensed in digital systems. Edge computing aims to support this data-to-value transformation. In digital systems that traditionally rely on central data gathering, edge computing proposes to push analysis towards the devices and data sources, thus leveraging the large cumulative computational power found in modern distributed systems. Some of the ideas promoted in edge computing are not new, though: continuous and distributed data analysis paradigms such as stream processing have argued for the need for smart distributed analysis for roughly two decades. Starting from this observation, this talk covers a set of standing challenges for smart, distributed, and continuous stream processing in edge computing, with real-world examples and use cases from smart grids and vehicular networks.

    Parallel Data Streaming Analytics in the Context of Internet of Things

    We are living in an increasingly connected world, where ubiquitous sensing technologies enable the interconnection of physical objects, as part of the Internet of Things (IoT), and provide a continuous, massive amount of data. As this growth soars, benefits and challenges come together, which requires the development of the right tools to extract valuable information from the data. To that end, new techniques (e.g. data stream processing) have emerged to perform continuous single-pass analysis and enhance parallelism. However, employing such techniques is not a trivial task due to challenges such as partial knowledge of the data and the trade-off between parallelism and consistency. Moreover, depending on the source, data volumes may fluctuate over time, which requires the degree of parallelism to be adapted at runtime.

    In this work, we contribute to the design of computational infrastructures and the development of tools to address these challenges. We focus on two problem domains. First, we target continuous data analysis, and particularly data clustering as a significant representative problem, to extract information from massive data generated by high-rate sensors. We propose Lisco, a single-pass continuous Euclidean-distance-based clustering algorithm that exploits the inherent ordering of spatial and temporal data, and its parallel counterpart, P-Lisco, to enhance pipeline and data parallelism. These algorithms provide high throughput of results with low latency by pushing the processing closer to the data sources. Moreover, we provide a framework, DRIVEN, that performs a continuous bounded-error approximation to compress the volumes of data and then transmits the compressed data to the next layers of the IoT architecture, where clustering is performed continuously using a generalized form of Lisco. The compression speeds up data transmission while preserving clustering quality very similar to that obtained from raw data.

    In the second domain, we target elasticity in data streaming, to utilize computational resources at runtime in response to data rate fluctuations. For that, we provide a stream processing framework, STRETCH, and introduce the concept of virtual shared-nothing parallelization, which is able to adapt resources, optimize throughput and latency, and preserve determinism. Thorough experimental evaluations on architectures representative of high-end servers and of resource-constrained embedded devices indicate the scalability benefits of all proposed algorithms.
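    To make the single-pass, order-exploiting idea concrete, the sketch below is a minimal illustration in Python (not the actual Lisco implementation; the point representation and the eps distance threshold are assumptions made for the example) of clustering an ordered stream in one pass: because points arrive in order, a cluster can be emitted as soon as the next point falls farther than the threshold away, so no point is ever revisited.

        # Illustrative sketch only; not the actual Lisco algorithm or interface.
        # Points arrive already ordered (e.g. by angle or time), so each point
        # only needs to be compared against the cluster currently being grown.
        import math

        def euclidean(p, q):
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

        def single_pass_cluster(points, eps):
            """Consume an ordered point stream once; emit clusters as they close."""
            current = []                       # the cluster currently being grown
            for p in points:
                if current and euclidean(current[-1], p) > eps:
                    yield current              # later points in the ordered stream
                    current = []               # can no longer join this cluster
                current.append(p)
            if current:
                yield current

        # Usage: a small ordered 2-D stream, with an assumed threshold of 1.0
        stream = [(0.0, 0.0), (0.5, 0.1), (0.9, 0.0), (5.0, 5.0), (5.2, 5.1)]
        for cluster in single_pass_cluster(stream, eps=1.0):
            print(cluster)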

    Highly concurrent stream synchronization in many-core embedded systems

    Embedded many-core architectures are expected to serve as significant components in the infrastructure of upcoming technologies, such as networks for the Internet of Things (IoT), facing real-time and stream processing challenges. In this work we explore the applicability of ScaleGate, a synchronization object from the massive data stream processing domain, on many-core embedded systems. We propose a new implementation of ScaleGate on the Epiphany architecture, a scalable embedded many-core co-processor, and study communication patterns that appear in the context of a baseband signal processing application. Our experimental evaluation shows significant improvements over standard barrier-based approaches, due to the asynchrony exploited through the use of ScaleGate.
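    As an illustration of the guarantee a ScaleGate-like object gives its readers, the simplified, single-threaded Python sketch below merges several timestamp-ordered streams and releases a tuple only when it is "ready", i.e. when no stream can still contribute an earlier tuple. The real ScaleGate is a concurrent, lock-free data structure shared by producer and consumer threads; this sketch captures only the ordering semantics, not the synchronization algorithm.

        # Simplified illustration of ready-tuple semantics; not the ScaleGate
        # implementation. Each input stream yields (timestamp, payload) pairs
        # in timestamp order; the merged output is in global timestamp order.
        import heapq

        def ready_merge(streams):
            iters = [iter(s) for s in streams]
            heads = []
            for i, it in enumerate(iters):
                ts, payload = next(it)         # assume every stream is non-empty
                heapq.heappush(heads, (ts, i, payload))
            while heads:
                ts, i, payload = heapq.heappop(heads)
                # Ready: every other stream has already produced a tuple with
                # timestamp >= ts, so the merge order of this tuple is final.
                yield ts, payload
                nxt = next(iters[i], None)
                if nxt is not None:
                    heapq.heappush(heads, (nxt[0], i, nxt[1]))

        # Usage: merge two sensor streams into one timestamp-ordered stream
        a = [(1, "a1"), (4, "a2"), (7, "a3")]
        b = [(2, "b1"), (3, "b2"), (9, "b3")]
        for ts, x in ready_merge([a, b]):
            print(ts, x)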

    Hardware-Aware Algorithm Designs for Efficient Parallel and Distributed Processing

    The introduction and widespread adoption of the Internet of Things, together with emerging new industrial applications, bring new requirements in data processing. Specifically, the need for timely processing of data that arrives at high rates creates a challenge for the traditional cloud computing paradigm, where data collected at various sources is sent to the cloud for processing. As an approach to this challenge, processing algorithms and infrastructure are distributed from the cloud to multiple tiers of computing, closer to the sources of data. This creates a wide range of devices for algorithms to be deployed on and for software designs to adapt to.

    In this thesis, we investigate how hardware-aware algorithm designs on a variety of platforms lead to algorithm implementations that efficiently utilize the underlying resources. We design, implement and evaluate new techniques for representative applications that involve the whole spectrum of devices, from resource-constrained sensors in the field to highly parallel servers. At each tier of processing capability, we identify key architectural features that are relevant for applications and propose designs that use these features to achieve high-rate, timely and energy-efficient processing.

    In the first part of the thesis, we focus on high-end servers and utilize two main approaches to achieve high-throughput processing: vectorization and thread parallelism. We employ vectorization for pattern matching algorithms used in security applications. We show that re-thinking the design of algorithms to better utilize the resources available on the platforms they are deployed on, such as vector processing units, can bring significant speedups in processing throughput. We then show how thread-aware data distribution and proper inter-thread synchronization allow scalability, especially for the problem of high-rate network traffic monitoring. We design a parallelization scheme for sketch-based algorithms that summarize traffic information, allowing them to handle incoming data at high rates and to answer queries on that data efficiently, without overheads.

    In the second part of the thesis, we target the intermediate tier of computing devices and focus on typical examples of the hardware found there. We show how single-board computers with embedded accelerators can be used to handle the computationally heavy part of applications, and demonstrate this specifically for pattern matching in security-related processing. We further identify key hardware features that affect the performance of pattern matching algorithms on such devices, present a co-evaluation framework to compare algorithms, and design a new algorithm that efficiently utilizes the hardware features.

    In the last part of the thesis, we shift the focus to the low-power, resource-constrained tier of processing devices. We target wireless sensor networks and study distributed data processing algorithms where the processing happens on the same devices that generate the data. Specifically, we focus on a continuous monitoring algorithm (geometric monitoring) that aims to minimize communication between nodes. By deploying that algorithm under realistic conditions, we demonstrate that the interplay between the network protocol and the application plays an important role in this layer of devices. Based on that observation, we co-design a continuous monitoring application with a modern network stack and augment it further with an in-network aggregation technique. In this way, we show that awareness of the underlying network stack is important to realize the full potential of the continuous monitoring algorithm.

    The techniques and solutions presented in this thesis contribute to better utilization of hardware characteristics across a wide spectrum of platforms. We employ these techniques on problems that are representative examples of current and upcoming applications, and contribute an outlook of emerging possibilities that can build on the results of the thesis.
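    As a concrete illustration of the kind of summary such sketch-based traffic monitoring relies on, the Python example below uses a count-min sketch, a well-known representative of this family (the specific sketch and parallelization scheme used in the thesis are not claimed here). Letting each thread keep its own sketch over its share of the traffic and merging the sketches by element-wise addition is one common way to combine independently built summaries when answering queries.

        # Illustrative count-min sketch: estimates per-flow counts in bounded
        # memory, overestimating only. Not the thesis's specific design.
        import random

        class CountMinSketch:
            def __init__(self, width=1024, depth=4, seed=0):
                rnd = random.Random(seed)
                self.width, self.depth = width, depth
                self.salts = [rnd.getrandbits(64) for _ in range(depth)]
                self.table = [[0] * width for _ in range(depth)]

            def _buckets(self, key):
                return [hash((salt, key)) % self.width for salt in self.salts]

            def add(self, key, count=1):
                for row, b in zip(self.table, self._buckets(key)):
                    row[b] += count

            def estimate(self, key):
                # The minimum over rows bounds the error from hash collisions.
                return min(row[b] for row, b in zip(self.table, self._buckets(key)))

            def merge(self, other):
                # Element-wise addition combines per-thread sketches cheaply.
                for r1, r2 in zip(self.table, other.table):
                    for i, v in enumerate(r2):
                        r1[i] += v

        # Usage: two "threads" summarize disjoint slices of traffic, then merge
        s1, s2 = CountMinSketch(), CountMinSketch()
        for _ in range(1000):
            s1.add(("10.0.0.1", "10.0.0.2"))
        for _ in range(500):
            s2.add(("10.0.0.1", "10.0.0.2"))
        s1.merge(s2)
        print(s1.estimate(("10.0.0.1", "10.0.0.2")))   # at least 1500, never less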