43,453 research outputs found

    ํด๋ผ์šฐ๋“œ ํ™˜๊ฒฝ์—์„œ ๋น ๋ฅด๊ณ  ํšจ์œจ์ ์ธ IoT ์ŠคํŠธ๋ฆผ ์ฒ˜๋ฆฌ๋ฅผ ์œ„ํ•œ ์—”๋“œ-ํˆฌ-์—”๋“œ ์ตœ์ ํ™”

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2021.8. ์—„ํƒœ๊ฑด.As a large amount of data streams are generated from Internet of Things (IoT) devices, two types of IoT stream queries are deployed in the cloud. One is a small IoT-stream query, which continuously processes a few IoT data streams of end-usersโ€™s IoT devices that have low input rates (e.g., one event per second). The other one is a big IoT-stream query, which is deployed by data scientists to continuously process a large number and huge amount of aggregated data streams that can suddenly fluctuate in a short period of time (bursty loads). However, existing work and stream systems fall short of handling such workloads efficiently because their query submission, compilation, execution, and resource acquisition layer are not optimized for the workloads. This dissertation proposes two end-to-end optimization techniquesโ€” not only optimizing stream query execution layer (runtime), but also optimizing query submission, compiler, or resource acquisition layer. First, to minimize the number of cloud machines and maintenance cost of servers in processing many small IoT queries, we build Pluto, a new stream processing system that optimizes both query submission and execution layer for efficiently handling many small IoT stream queries. By decoupling IoT query submission and its code registration and offering new APIs, Pluto mitigates the bottleneck in query submission and enables efficient resource sharing across small IoT stream queries in the execution. Second, to quickly handle sudden bursty loads and scale out big IoT stream queries, we build Sponge, which is a new stream system that optimizes query compilation, execution, and resource acquisition layer altogether. For fast acquisition of new resources, Sponge uses a new cloud computing service, called Lambda, because it offers fast-to-start lightweight containers. Sponge then converts the streaming dataflow of big stream queries to overcome Lambdaโ€™s resource constraint and to minimize scaling overheads at runtime. Our evaluations show that the end-to-end optimization techniques significantly improve system throughput and latency compared to existing stream systems in handling a large number of small IoT stream queries and in handling bursty loads of big IoT stream queries.๋‹ค์–‘ํ•œ IoT ๋””๋ฐ”์ด์Šค๋กœ๋ถ€ํ„ฐ ๋งŽ์€ ์–‘์˜ ๋ฐ์ดํ„ฐ ์ŠคํŠธ๋ฆผ๋“ค์ด ์ƒ์„ฑ๋˜๋ฉด์„œ, ํฌ๊ฒŒ ๋‘ ๊ฐ€์ง€ ํƒ€์ž…์˜ ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ๊ฐ€ ํด๋ผ์šฐ๋“œ์—์„œ ์ˆ˜ํ–‰๋œ๋‹ค. ์ฒซ์งธ๋กœ๋Š” ์ž‘์€-IoT ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ์ด๋ฉฐ, ํ•˜๋‚˜์˜ ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ๊ฐ€ ์ ์€ ์–‘์˜ IoT ๋ฐ์ดํ„ฐ ์ŠคํŠธ๋ฆผ์„ ์ฒ˜๋ฆฌํ•˜๊ณ  ๋งŽ์€ ์ˆ˜์˜ ์ž‘์€ ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ๋“ค์ด ์กด์žฌํ•œ๋‹ค. ๋‘๋ฒˆ์งธ๋กœ๋Š” ํฐ-IoT ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ์ด๋ฉฐ, ํ•˜๋‚˜ ์˜ ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ๊ฐ€ ๋งŽ์€ ์–‘์˜, ๊ธ‰๊ฒฉํžˆ ์ฆ๊ฐ€ํ•˜๋Š” IoT ๋ฐ์ดํ„ฐ ์ŠคํŠธ๋ฆผ๋“ค์„ ์ฒ˜๋ฆฌํ•œ๋‹ค. ํ•˜์ง€๋งŒ, ๊ธฐ์กด ์—ฐ๊ตฌ์™€ ์ŠคํŠธ๋ฆผ ์‹œ์Šคํ…œ์—์„œ๋Š” ์ฟผ๋ฆฌ ์ˆ˜ํ–‰, ์ œ์ถœ, ์ปดํŒŒ์ผ๋Ÿฌ, ๋ฐ ๋ฆฌ์†Œ์Šค ํ™•๋ณด ๋ ˆ์ด์–ด๊ฐ€ ์ด๋Ÿฌํ•œ ์›Œํฌ๋กœ๋“œ์— ์ตœ์ ํ™”๋˜์–ด ์žˆ์ง€ ์•Š์•„์„œ ์ž‘์€-IoT ๋ฐ ํฐ-IoT ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ๋ฅผ ํšจ์œจ์ ์œผ๋กœ ์ฒ˜๋ฆฌํ•˜์ง€ ๋ชปํ•œ๋‹ค. ์ด ๋…ผ๋ฌธ์—์„œ๋Š” ์ž‘์€-IoT ๋ฐ ํฐ-IoT ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ ์›Œํฌ๋กœ๋“œ๋ฅผ ์ตœ์ ํ™”ํ•˜๊ธฐ ์œ„ํ•œ ์—”๋“œ-ํˆฌ-์—”๋“œ ์ตœ์ ํ™” ๊ธฐ๋ฒ•์„ ์†Œ๊ฐœํ•œ๋‹ค. ์ฒซ๋ฒˆ์งธ๋กœ, ๋งŽ์€ ์ˆ˜์˜ ์ž‘์€-IoT ์ŠคํŠธ๋ฆผ ์ฟผ ๋ฆฌ๋ฅผ ์ฒ˜๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด, ์ฟผ๋ฆฌ ์ œ์ถœ๊ณผ ์ˆ˜ํ–‰ ๋ ˆ์ด์–ด๋ฅผ ์ตœ์ ํ™” ํ•˜๋Š” ๊ธฐ๋ฒ•์ธ IoT ํŠน์„ฑ ๊ธฐ๋ฐ˜ ์ตœ์ ํ™”๋ฅผ ์ˆ˜ํ–‰ํ•œ๋‹ค. ์ฟผ๋ฆฌ ์ œ์ถœ๊ณผ ์ฝ”๋“œ ๋“ฑ๋ก์„ ๋ถ„๋ฆฌํ•˜๊ณ , ์ด๋ฅผ ์œ„ํ•œ ์ƒˆ๋กœ์šด API๋ฅผ ์ œ๊ณตํ•จ์œผ๋กœ์จ, ์ฟผ๋ฆฌ ์ œ์ถœ์—์„œ์˜ ์˜ค๋ฒ„ํ—ค๋“œ๋ฅผ ์ค„์ด๊ณ  ์ฟผ๋ฆฌ ์ˆ˜ํ–‰์—์„œ IoT ํŠน ์„ฑ ๊ธฐ๋ฐ˜์œผ๋กœ ๋ฆฌ์†Œ์Šค๋ฅผ ๊ณต์œ ํ•จ์œผ๋กœ์จ ์˜ค๋ฒ„ํ—ค๋“œ๋ฅผ ์ค„์ธ๋‹ค. ๋‘๋ฒˆ์งธ๋กœ, ํฐ-IoT ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ์—์„œ ๊ธ‰๊ฒฉํžˆ ์ฆ๊ฐ€ํ•˜๋Š” ๋กœ๋“œ๋ฅผ ๋น ๋ฅด๊ฒŒ ์ฒ˜๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด, ์ฟผ๋ฆฌ ์ปดํŒŒ์ผ๋Ÿฌ, ์ˆ˜ํ–‰, ๋ฐ ๋ฆฌ์†Œ์Šค ํ™•๋ณด ๋ ˆ์ด์–ด ์ตœ์ ํ™”๋ฅผ ์ˆ˜ํ–‰ํ•œ๋‹ค. ์ƒˆ๋กœ์šด ํด๋ผ์šฐ๋“œ ์ปดํ“จํŒ… ๋ฆฌ์†Œ์Šค์ธ ๋žŒ๋‹ค๋ฅผ ํ™œ์šฉํ•˜์—ฌ ๋น ๋ฅด๊ฒŒ ๋ฆฌ์†Œ์Šค๋ฅผ ํ™•๋ณดํ•˜๊ณ , ๋žŒ๋‹ค์˜ ์ œํ•œ๋œ ๋ฆฌ์†Œ์Šค์—์„œ ์Šค์ผ€์ผ-์•„์›ƒ ์˜ค ๋ฒ„ํ—ค๋“œ๋ฅผ ์ค„์ด๊ธฐ ์œ„ํ•ด ์ŠคํŠธ๋ฆผ ๋ฐ์ดํ„ฐํ”Œ๋กœ์šฐ๋ฅผ ๋ฐ”๊ฟˆ์œผ๋กœ์จ ํฐ-IoT ์ŠคํŠธ๋ฆผ ์ฟผ๋ฆฌ์˜ ์ž‘์—…๋Ÿ‰์„ ๋น ๋ฅด๊ฒŒ ๋žŒ๋‹ค๋กœ ์˜ฎ๊ธด๋‹ค. ์ตœ์ ํ™” ๊ธฐ๋ฒ•์˜ ํšจ๊ณผ๋ฅผ ๋ณด์—ฌ์ฃผ๊ธฐ ์œ„ํ•ด, ์ด ๋…ผ๋ฌธ์—์„œ๋Š” ๋‘๊ฐ€์ง€ ์‹œ์Šคํ…œ-Pluto ์™€ Sponge-์„ ๊ฐœ๋ฐœํ•˜์˜€๋‹ค. ์‹คํ—˜์„ ํ†ตํ•ด์„œ, ๊ฐ ์ตœ์ ํ™” ๊ธฐ๋ฒ•์„ ์ ์šฉํ•œ ๊ฒฐ๊ณผ ๊ธฐ์กด ์‹œ์Šคํ…œ ๋Œ€๋น„ ์ฒ˜๋ฆฌ๋Ÿ‰์„ ํฌ๊ฒŒ ํ–ฅ์ƒ์‹œ์ผฐ์œผ๋ฉฐ, ์ง€์—ฐ์‹œ๊ฐ„์„ ์ตœ์†Œํ™”ํ•˜๋Š” ๊ฒƒ์„ ํ™•์ธํ•˜์˜€๋‹ค.Chapter 1 Introduction 1 1.1 IoT Stream Workloads 1 1.1.1 Small IoT Stream Query 2 1.1.2 Big IoT Stream Query 4 1.2 Proposed Solution 5 1.2.1 IoT-Aware Three-Phase Query Execution 6 1.2.2 Streaming Dataflow Reshaping on Lambda 7 1.3 Contribution 8 1.4 Dissertation Structure 9 Chapter 2 Background 10 2.1 Stream Query Model 10 2.2 Workload Characteristics 12 2.2.1 Small IoT Stream Query 12 2.2.2 Big IoT Stream Query 13 Chapter 3 IoT-Aware Three-Phase Query Execution 15 3.1 Pluto Design Overview 16 3.2 Decoupling of Code and Query Submission 19 3.2.1 Code Registration 19 3.2.2 Query Submission API 20 3.3 IoT-Aware Execution Model 21 3.3.1 Q-Group Creation and Query Grouping 24 3.3.2 Q-Group Assignment 24 3.3.3 Q-Group Scheduling and Processing 25 3.3.4 Load Rebalancing: Q-Group Split and Merging 28 3.4 Implementation 29 3.5 Evaluation 30 3.5.1 Methodology 30 3.5.2 Performance Comparison 34 3.5.3 Performance Breakdown 36 3.5.4 Load Rebalancing: Q-Group Split and Merging 38 3.5.5 Tradeoff 40 3.6 Discussion 41 3.7 Related Work 43 3.8 Summary 44 Chapter 4 Streaming Dataflow Reshaping for Fast Scaling Mechanism on Lambda 46 4.1 Motivation 46 4.2 Challenges 47 4.3 Design Overview 50 4.4 Reshaping Rules 51 4.4.1 R1:Inserting Router Operators 52 4.4.2 R2:Inserting Transient Operators 54 4.4.3 R3:Inserting State Merger Operators 57 4.5 Scaling Protocol 59 4.5.1 Redirection Protocol 59 4.5.2 Merging Protocol 60 4.5.3 Migration Protocol 61 4.6 Implementation 61 4.7 Evaluation 63 4.7.1 Methodology 63 4.7.2 Performance Analysis 68 4.7.3 Performance Breakdown 70 4.7.4 Latency-Cost($) Trade-Off 76 4.8 Discussion 77 4.9 Related Work 78 4.10 Summary 80 Chapter 5 Conclusion 81๋ฐ•

    Continuous Learning of HPC Infrastructure Models using Big Data Analytics and In-Memory processing Tools

    Get PDF
    open4siThis work was supported, in parts, by the FP7 ERC Advance project MULTITHERMAN (g.a. 291125), by the EU H2020 FETHPC project ANTAREX (g.a. 67623) and by the EU H2020 FETHPC project Exanode (g.a. 671578).Exascale computing represents the next leap in the HPC race. Reaching this level of performance is subject to several engineering challenges such as energy consumption, equipment-cooling, reliability and massive parallelism. Model-based optimization is an essential tool in the design process and control of energy efficient, reliable and thermally constrained systems. However, in the Exascale domain, model learning techniques tailored to the specific supercomputer require real measurements and must therefore handle and analyze a massive amount of data coming from the HPC monitoring infrastructure. This becomes rapidly a 'big data' scale problem. The common approach where measurements are first stored in large databases and then processed is no more affordable due to the increasingly storage costs and lack of real-time support. Nowadays instead, cloud-based machine learning techniques aim to build on-line models using real-time approaches such as 'stream processing' and 'in-memory' computing, that avoid storage costs and enable fastdata processing. Moreover, the fast delivery and adaptation of the models to the quick data variations, make the decision stage of the optimization loop more effective and reliable. In this paper we leverage scalable, lightweight and flexible IoT technologies, such as the MQTT protocol, to build a highly scalable HPC monitoring infrastructure able to handle the massive sensor data produced by next-gen HPC components. We then show how state-of-the art tools for big data computing and analysis, such as Apache Spark, can be used to manage the huge amount of data delivered by the monitoring layer and to build adaptive models in real-time using on-line machine learning techniques.openBeneventi, Francesco; Bartolini, Andrea; Cavazzoni, Carlo; Benini, LucaBeneventi, Francesco; Bartolini, Andrea; Cavazzoni, Carlo; Benini, Luc

    Knowledge-infused and Consistent Complex Event Processing over Real-time and Persistent Streams

    Full text link
    Emerging applications in Internet of Things (IoT) and Cyber-Physical Systems (CPS) present novel challenges to Big Data platforms for performing online analytics. Ubiquitous sensors from IoT deployments are able to generate data streams at high velocity, that include information from a variety of domains, and accumulate to large volumes on disk. Complex Event Processing (CEP) is recognized as an important real-time computing paradigm for analyzing continuous data streams. However, existing work on CEP is largely limited to relational query processing, exposing two distinctive gaps for query specification and execution: (1) infusing the relational query model with higher level knowledge semantics, and (2) seamless query evaluation across temporal spaces that span past, present and future events. These allow accessible analytics over data streams having properties from different disciplines, and help span the velocity (real-time) and volume (persistent) dimensions. In this article, we introduce a Knowledge-infused CEP (X-CEP) framework that provides domain-aware knowledge query constructs along with temporal operators that allow end-to-end queries to span across real-time and persistent streams. We translate this query model to efficient query execution over online and offline data streams, proposing several optimizations to mitigate the overheads introduced by evaluating semantic predicates and in accessing high-volume historic data streams. The proposed X-CEP query model and execution approaches are implemented in our prototype semantic CEP engine, SCEPter. We validate our query model using domain-aware CEP queries from a real-world Smart Power Grid application, and experimentally analyze the benefits of our optimizations for executing these queries, using event streams from a campus-microgrid IoT deployment.Comment: 34 pages, 16 figures, accepted in Future Generation Computer Systems, October 27, 201
    • โ€ฆ
    corecore