3,673 research outputs found

    Efficient Data Streaming Analytic Designs for Parallel and Distributed Processing

    Get PDF
    Today, ubiquitously sensing technologies enable inter-connection of physical\ua0objects, as part of Internet of Things (IoT), and provide massive amounts of\ua0data streams. In such scenarios, the demand for timely analysis has resulted in\ua0a shift of data processing paradigms towards continuous, parallel, and multitier\ua0computing. However, these paradigms are followed by several challenges\ua0especially regarding analysis speed, precision, costs, and deterministic execution.\ua0This thesis studies a number of such challenges to enable efficient continuous\ua0processing of streams of data in a decentralized and timely manner.In the first part of the thesis, we investigate techniques aiming at speeding\ua0up the processing without a loss in precision. The focus is on continuous\ua0machine learning/data mining types of problems, appearing commonly in IoT\ua0applications, and in particular continuous clustering and monitoring, for which\ua0we present novel algorithms; (i) Lisco, a sequential algorithm to cluster data\ua0points collected by LiDAR (a distance sensor that creates a 3D mapping of the\ua0environment), (ii) p-Lisco, the parallel version of Lisco to enhance pipeline- and\ua0data-parallelism of the latter, (iii) pi-Lisco, the parallel and incremental version\ua0to reuse the information and prevent redundant computations, (iv) g-Lisco, a\ua0generalized version of Lisco to cluster any data with spatio-temporal locality\ua0by leveraging the implicit ordering of the data, and (v) Amble, a continuous\ua0monitoring solution in an industrial process.In the second part, we investigate techniques to reduce the analysis costs\ua0in addition to speeding up the processing while also supporting deterministic\ua0execution. The focus is on problems associated with availability and utilization\ua0of computing resources, namely reducing the volumes of data, involving\ua0concurrent computing elements, and adjusting the level of concurrency. For\ua0that, we propose three frameworks; (i) DRIVEN, a framework to continuously\ua0compress the data and enable efficient transmission of the compact data in the\ua0processing pipeline, (ii) STRATUM, a framework to continuously pre-process\ua0the data before transferring the later to upper tiers for further processing, and\ua0(iii) STRETCH, a framework to enable instantaneous elastic reconfigurations\ua0to adjust intra-node resources at runtime while ensuring determinism.The algorithms and frameworks presented in this thesis contribute to an\ua0efficient processing of data streams in an online manner while utilizing available\ua0resources. Using extensive evaluations, we show the efficiency and achievements\ua0of the proposed techniques for IoT representative applications that involve a\ua0wide spectrum of platforms, and illustrate that the performance of our work\ua0exceeds that of state-of-the-art techniques

    Edge Video Analytics: A Survey on Applications, Systems and Enabling Techniques

    Full text link
    Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. The basic concepts of EVA (e.g., definition, architectures) were not fully elucidated due to the rapid development of this domain. To fill these gaps, we provide a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.Comment: 31 pages, 13 figure

    Timely processing of big data in collaborative large-scale distributed systems

    Get PDF
    Today’s Big Data phenomenon, characterized by huge volumes of data produced at very high rates by heterogeneous and geographically dispersed sources, is fostering the employment of large-scale distributed systems in order to leverage parallelism, fault tolerance and locality awareness with the aim of delivering suitable performances. Among the several areas where Big Data is gaining increasing significance, the protection of Critical Infrastructure is one of the most strategic since it impacts on the stability and safety of entire countries. Intrusion detection mechanisms can benefit a lot from novel Big Data technologies because these allow to exploit much more information in order to sharpen the accuracy of threats discovery. A key aspect for increasing even more the amount of data at disposal for detection purposes is the collaboration (meant as information sharing) among distinct actors that share the common goal of maximizing the chances to recognize malicious activities earlier. Indeed, if an agreement can be found to share their data, they all have the possibility to definitely improve their cyber defenses. The abstraction of Semantic Room (SR) allows interested parties to form trusted and contractually regulated federations, the Semantic Rooms, for the sake of secure information sharing and processing. Another crucial point for the effectiveness of cyber protection mechanisms is the timeliness of the detection, because the sooner a threat is identified, the faster proper countermeasures can be put in place so as to confine any damage. Within this context, the contributions reported in this thesis are threefold * As a case study to show how collaboration can enhance the efficacy of security tools, we developed a novel algorithm for the detection of stealthy port scans, named R-SYN (Ranked SYN port scan detection). We implemented it in three distinct technologies, all of them integrated within an SR-compliant architecture that allows for collaboration through information sharing: (i) in a centralized Complex Event Processing (CEP) engine (Esper), (ii) in a framework for distributed event processing (Storm) and (iii) in Agilis, a novel platform for batch-oriented processing which leverages the Hadoop framework and a RAM-based storage for fast data access. Regardless of the employed technology, all the evaluations have shown that increasing the number of participants (that is, increasing the amount of input data at disposal), allows to improve the detection accuracy. The experiments made clear that a distributed approach allows for lower detection latency and for keeping up with higher input throughput, compared with a centralized one. * Distributing the computation over a set of physical nodes introduces the issue of improving the way available resources are assigned to the elaboration tasks to execute, with the aim of minimizing the time the computation takes to complete. We investigated this aspect in Storm by developing two distinct scheduling algorithms, both aimed at decreasing the average elaboration time of the single input event by decreasing the inter-node traffic. Experimental evaluations showed that these two algorithms can improve the performance up to 30%. * Computations in online processing platforms (like Esper and Storm) are run continuously, and the need of refining running computations or adding new computations, together with the need to cope with the variability of the input, requires the possibility to adapt the resource allocation at runtime, which entails a set of additional problems. Among them, the most relevant concern how to cope with incoming data and processing state while the topology is being reconfigured, and the issue of temporary reduced performance. At this aim, we also explored the alternative approach of running the computation periodically on batches of input data: although it involves a performance penalty on the elaboration latency, it allows to eliminate the great complexity of dynamic reconfigurations. We chose Hadoop as batch-oriented processing framework and we developed some strategies specific for dealing with computations based on time windows, which are very likely to be used for pattern recognition purposes, like in the case of intrusion detection. Our evaluations provided a comparison of these strategies and made evident the kind of performance that this approach can provide

    A sub-mW IoT-endnode for always-on visual monitoring and smart triggering

    Full text link
    This work presents a fully-programmable Internet of Things (IoT) visual sensing node that targets sub-mW power consumption in always-on monitoring scenarios. The system features a spatial-contrast 128x64128\mathrm{x}64 binary pixel imager with focal-plane processing. The sensor, when working at its lowest power mode (10μW10\mu W at 10 fps), provides as output the number of changed pixels. Based on this information, a dedicated camera interface, implemented on a low-power FPGA, wakes up an ultra-low-power parallel processing unit to extract context-aware visual information. We evaluate the smart sensor on three always-on visual triggering application scenarios. Triggering accuracy comparable to RGB image sensors is achieved at nominal lighting conditions, while consuming an average power between 193μW193\mu W and 277μW277\mu W, depending on context activity. The digital sub-system is extremely flexible, thanks to a fully-programmable digital signal processing engine, but still achieves 19x lower power consumption compared to MCU-based cameras with significantly lower on-board computing capabilities.Comment: 11 pages, 9 figures, submitteted to IEEE IoT Journa
    • …
    corecore