51 research outputs found

    AWAIT: Efficient Overload Management for Busy Multi-tier Web Services under Bursty Workloads

    The problem of service differentiation and admission control in web services that use a multi-tier architecture is more challenging than in single-tier services, especially under bursty conditions, i.e., when arrivals of user web sessions are characterized by temporal surges in their intensities and demands. We demonstrate that classic session-based admission control techniques triggered by threshold violations are ineffective under bursty workload conditions: user-perceived performance metrics deteriorate rapidly and dramatically, and the system inadvertently rejects requests from already accepted user sessions, resulting in business loss. As a solution for service differentiation of accepted user sessions, we promote a methodology based on blocking: when the system operates in overload, requests from accepted sessions are not rejected but are instead stored in a blocking queue that effectively acts as a waiting room. Requests in the blocking queue implicitly become of higher priority and are served as soon as the load subsides. Residence in the blocking queue comes at a performance cost, since blocking time adds to the perceived end-to-end user response time. We present a novel autonomic session-based admission control policy, called AWAIT, that adaptively adjusts the capacity of the blocking queue as a function of workload burstiness in order to meet predefined user service level objectives while keeping the portion of aborted accepted sessions to a minimum. Detailed simulations illustrate the effectiveness of AWAIT under different workload burstiness profiles and strongly argue for its effectiveness.
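
    To make the blocking-queue mechanism concrete, below is a minimal sketch of an AWAIT-style controller. The multiplicative capacity-adaptation rule, the burstiness_index input, and all class and method names are illustrative assumptions for exposition, not the published policy.

        import queue
        import threading

        class BlockingAdmissionController:
            def __init__(self, base_capacity=100):
                self.base_capacity = base_capacity
                self.capacity = base_capacity
                self.blocked = queue.Queue()   # the "waiting room" for accepted sessions
                self.lock = threading.Lock()

            def adapt_capacity(self, burstiness_index):
                # Hypothetical rule: grow the waiting room under burstier load so
                # requests from admitted sessions are delayed rather than aborted.
                with self.lock:
                    self.capacity = int(self.base_capacity * (1.0 + burstiness_index))

            def admit(self, request, overloaded):
                if not overloaded:
                    return "serve"              # normal operation: serve immediately
                if self.blocked.qsize() < self.capacity:
                    self.blocked.put(request)   # overload: block instead of rejecting
                    return "blocked"
                return "reject"                 # waiting room full: session is aborted

            def drain(self, serve_fn):
                # Blocked requests are implicitly high priority: serve them first
                # once load subsides (blocking time adds to response time).
                while not self.blocked.empty():
                    serve_fn(self.blocked.get())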

    Dependence-driven techniques in system design

    Burstiness in workloads is often found in multi-tier architectures, storage systems, and communication networks. This feature is extremely important in system design because it can significantly degrade system performance and availability. This dissertation focuses on how to use knowledge of burstiness to develop new techniques and tools for performance prediction, scheduling, and resource allocation under bursty workload conditions.

    For multi-tier enterprise systems, burstiness in service times is catastrophic for performance. Via detailed experimentation, we identify the cause of performance degradation as the persistent bottleneck switch among the various servers. This results in unstable behavior that cannot be captured by existing capacity planning models. Beyond identifying the cause and effects of bottleneck switch in multi-tier systems, we also propose modifications to the classic TPC-W benchmark to emulate bursty arrivals in multi-tier systems.

    This dissertation also demonstrates how burstiness can be used to improve system performance. Two dependence-driven scheduling policies, SWAP and ALoC, are developed. These general scheduling policies counteract burstiness in workloads and maintain high availability by delaying selected requests that contribute to burstiness. Extensive experiments show that both SWAP and ALoC obtain good estimates of service times based on knowledge of burstiness in the service process. As a result, SWAP successfully approximates shortest-job-first (SJF) scheduling without requiring a priori information about job service times, and ALoC adaptively controls system load by indefinitely delaying only a small fraction of the incoming requests.

    Knowledge of burstiness can also be used to forecast the length of idle intervals in storage systems. In practice, background activities are scheduled during system idle times, and the scheduling of background jobs is crucial for both the performance degradation of foreground jobs and the utilization of idle times. New background scheduling schemes are designed to determine when, and for how long, idle times can be used to serve background jobs without violating predefined performance targets of foreground jobs. Extensive trace-driven simulation results illustrate that the proposed schemes are effective and robust across a wide range of system conditions. Furthermore, if there is burstiness within idle times, maintenance features like disk scrubbing and intra-disk data redundancy can be successfully scheduled as background activities during idle times.
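
    As a rough illustration of how delaying burst-contributing requests can approximate SJF without knowing job sizes, the following sketch defers requests whenever a simple burst detector fires. The two-state heuristic, the window, and the threshold are assumptions made for exposition; they are not the dissertation's actual SWAP or ALoC algorithms.

        from collections import deque

        class BurstAwareScheduler:
            def __init__(self, window=50, delay_threshold=2.0):
                self.history = deque(maxlen=window)  # recent observed service times
                self.delay_threshold = delay_threshold
                self.delayed = deque()               # requests deferred during bursts

            def observe(self, service_time):
                self.history.append(service_time)

            def in_burst(self):
                # Heuristic: the last service time is far above the recent mean,
                # suggesting the service process is in a "slow" state.
                if len(self.history) < 2:
                    return False
                mean = sum(self.history) / len(self.history)
                return self.history[-1] > self.delay_threshold * mean

            def schedule(self, request):
                # Approximate SJF: during a burst, defer the request so that
                # likely-short jobs arriving next are served first.
                if self.in_burst():
                    self.delayed.append(request)
                    return None
                return request

            def release_one(self):
                # Re-admit one deferred request once the burst has passed.
                return self.delayed.popleft() if self.delayed else None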

    BURN: Enabling Workload Burstiness in Customized Service Benchmarks


    PERFUME: Power and performance guarantee with fuzzy MIMO control in virtualized servers

    It is important but challenging to assure the performance of multi-tier Internet applications under the power consumption cap of virtualized server clusters, mainly due to the system complexity of shared infrastructure and the dynamic and bursty nature of workloads. This paper presents PERFUME, a system that simultaneously guarantees power and performance targets with flexible tradeoffs while assuring control accuracy and system stability. Based on the proposed fuzzy MIMO control technique, it accurately controls both the throughput and the percentile-based response time of multi-tier applications thanks to a novel fuzzy model that integrates the strengths of fuzzy logic, MIMO control, and artificial neural networks. It is self-adaptive to highly dynamic and bursty workloads due to online learning of control model parameters using a computationally efficient weighted recursive least-squares method. We implement PERFUME in a testbed of virtualized blade servers hosting two multi-tier RUBiS applications. Experimental results demonstrate its control accuracy, system stability, flexibility in selecting trade-offs between conflicting targets, and robustness against highly dynamic variation and burstiness in workloads. It outperforms a representative utility-based approach in guaranteeing system throughput, percentile-based response time, and power budget in the face of highly dynamic and bursty workloads.
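
    The weighted recursive least-squares update behind the online learning step is standard and compact enough to sketch. The class structure, the forgetting factor value, and the initialization below are illustrative assumptions, not PERFUME's actual implementation.

        import numpy as np

        class WeightedRLS:
            def __init__(self, n_params, lam=0.98):
                self.theta = np.zeros(n_params)      # control-model parameters
                self.P = np.eye(n_params) * 1000.0   # inverse correlation matrix
                self.lam = lam                       # forgetting factor: <1 weights recent samples

            def update(self, x, y):
                # x: regressor vector (e.g., recent inputs/outputs), y: measured output
                x = np.asarray(x, dtype=float)
                Px = self.P @ x
                k = Px / (self.lam + x @ Px)         # gain vector
                err = y - x @ self.theta             # one-step prediction error
                self.theta = self.theta + k * err
                self.P = (self.P - np.outer(k, Px)) / self.lam
                return self.theta

    Each control interval, the measured throughput or response time feeds one update call, keeping the fuzzy model's local linear parameters current as the workload drifts.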

    Parallel Continuous Preference Queries over Out-of-Order and Bursty Data Streams

    Techniques to handle traffic bursts and out-of-order arrivals are of paramount importance for providing real-time sensor data analytics in domains like traffic surveillance, transportation management, healthcare, and security. In these systems, the raw data coming from sensors must be analyzed by continuous queries that extract value-added information used to make informed decisions in real time. To perform this task under timing constraints, parallelism must be exploited in query execution to enable real-time processing on parallel architectures. In this paper we focus on continuous preference queries, a representative class of continuous queries for decision making, and we propose a parallel query model targeting efficient processing over out-of-order and bursty data streams. We study how to integrate punctuation mechanisms to enable out-of-order processing. Then, we present advanced scheduling strategies targeting scenarios with different burstiness levels, parameterized using the index of dispersion. Extensive experiments have been performed using synthetic datasets and real-world data streams obtained from an existing real-time locating system. The experimental evaluation demonstrates the efficiency of our parallel solution and its effectiveness in handling the degrees of out-of-orderness and levels of burstiness found in real-world applications.
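
    One common formulation of the index of dispersion used to quantify burstiness is the variance-to-mean ratio of arrival counts over fixed time windows. The helper below computes it under that assumption; the window size is an arbitrary illustrative choice.

        import numpy as np

        def index_of_dispersion(arrival_times, window=1.0):
            # Bin arrivals into fixed-width windows and compare the variance of
            # per-window counts against their mean.
            arrival_times = np.sort(np.asarray(arrival_times, dtype=float))
            t_end = arrival_times[-1]
            edges = np.arange(0.0, t_end + 2 * window, window)
            counts, _ = np.histogram(arrival_times, bins=edges)
            mean = counts.mean()
            return counts.var() / mean if mean > 0 else 0.0

    For a Poisson stream this value is close to 1; values well above 1 indicate bursty arrivals that would trigger the more conservative scheduling strategy.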

    Coordinated VM Resizing and Server Tuning: Throughput, Power Efficiency and Scalability


    ํด๋ผ์šฐ๋“œ ํ™˜๊ฒฝ์—์„œ ๋น ๋ฅด๊ณ  ํšจ์œจ์ ์ธ IoT ์ŠคํŠธ๋ฆผ ์ฒ˜๋ฆฌ๋ฅผ ์œ„ํ•œ ์—”๋“œ-ํˆฌ-์—”๋“œ ์ตœ์ ํ™”

    Ph.D. dissertation, Department of Computer Science and Engineering, College of Engineering, Seoul National University Graduate School, August 2021. Taegeon Um.
    As large volumes of data streams are generated by Internet of Things (IoT) devices, two types of IoT stream queries are deployed in the cloud. One is the small IoT-stream query, which continuously processes a few IoT data streams from end-users' IoT devices at low input rates (e.g., one event per second). The other is the big IoT-stream query, which data scientists deploy to continuously process a large number of aggregated data streams whose volume can fluctuate suddenly within a short period of time (bursty loads). However, existing work and stream systems fall short of handling such workloads efficiently because their query submission, compilation, execution, and resource acquisition layers are not optimized for them. This dissertation proposes two end-to-end optimization techniques that optimize not only the stream query execution layer (runtime) but also the query submission, compiler, and resource acquisition layers. First, to minimize the number of cloud machines and the server maintenance cost of processing many small IoT queries, we build Pluto, a new stream processing system that optimizes both the query submission and execution layers for efficiently handling many small IoT stream queries. By decoupling IoT query submission from its code registration and offering new APIs, Pluto mitigates the bottleneck in query submission and enables efficient resource sharing across small IoT stream queries during execution. Second, to quickly handle sudden bursty loads and scale out big IoT stream queries, we build Sponge, a new stream system that optimizes the query compilation, execution, and resource acquisition layers altogether. For fast acquisition of new resources, Sponge uses a new cloud computing service, called Lambda, which offers fast-to-start lightweight containers. Sponge then converts the streaming dataflow of big stream queries to overcome Lambda's resource constraints and to minimize scaling overheads at runtime. Our evaluations show that these end-to-end optimization techniques significantly improve system throughput and latency compared to existing stream systems, both in handling a large number of small IoT stream queries and in handling bursty loads of big IoT stream queries.
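
    A minimal sketch of the shared-execution idea behind grouping many small IoT stream queries: rather than one thread per low-rate query, queries with similar characteristics share a scheduling unit. The grouping key, the callback-style queries, and all names below are hypothetical illustrations, not Pluto's actual API.

        from collections import defaultdict

        class QGroupExecutor:
            def __init__(self):
                self.groups = defaultdict(list)   # group key -> list of queries

            def register(self, query, group_key):
                # Queries with similar IoT characteristics (e.g., same source
                # region or registered code) share a group, so one scheduling
                # unit serves many low-rate queries.
                self.groups[group_key].append(query)

            def process_event(self, group_key, event):
                # One dispatch handles the whole group, amortizing per-query
                # scheduling overhead across its members.
                for query in self.groups[group_key]:
                    query(event)

        # Usage: executor = QGroupExecutor()
        #        executor.register(lambda e: print("q1", e), "region-a")
        #        executor.process_event("region-a", {"temp": 21})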

    V-Cache: Towards Flexible Resource Provisioning for Multi-tier Applications in IaaS Clouds

    Although the resource elasticity offered by Infrastructure-as-a-Service (IaaS) clouds opens up opportunities for elastic application performance, it also poses challenges to application management. Cluster applications, such as multi-tier websites, further complicate management, requiring not only accurate capacity planning but also proper partitioning of the resources into a number of virtual machines. Instead of burdening cloud users with complex management, we move the task of determining the optimal resource configuration for cluster applications to cloud providers. We find that a structural reorganization of multi-tier websites, adding a caching tier that runs on resources debited from the original resource budget, significantly boosts application performance and reduces resource usage. We propose V-Cache, a machine learning based approach to flexible provisioning of resources for multi-tier applications in clouds. V-Cache transparently places a caching proxy in front of the application. It uses a genetic algorithm to identify the incoming requests that benefit most from caching and dynamically resizes the cache space to accommodate these requests. We develop a reinforcement learning algorithm to optimally allocate the remaining capacity to the other tiers. We have implemented V-Cache on a VMware-based cloud testbed. Experimental results with the RUBiS and WikiBench benchmarks show that V-Cache outperforms a representative capacity management scheme and a cloud-cache based resource provisioning approach by at least 15% in performance, and achieves at least 11% and 21% savings on CPU and memory resources, respectively.
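
    A toy version of the genetic-algorithm step that selects which request classes to cache is sketched below, assuming a fitness of total caching benefit subject to a cache-space budget (at least two request classes). The population size, operators, and fitness model are illustrative assumptions, not V-Cache's implementation.

        import random

        def fitness(chromosome, benefit, size, cache_budget):
            # A chromosome is a bitstring: bit i = 1 caches request class i.
            total_size = sum(s for bit, s in zip(chromosome, size) if bit)
            if total_size > cache_budget:
                return 0.0                           # infeasible: exceeds cache space
            return sum(b for bit, b in zip(chromosome, benefit) if bit)

        def evolve(benefit, size, cache_budget, pop=30, gens=100):
            n = len(benefit)
            population = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop)]
            for _ in range(gens):
                scored = sorted(population,
                                key=lambda c: fitness(c, benefit, size, cache_budget),
                                reverse=True)
                parents = scored[: pop // 2]         # keep the fitter half
                children = []
                while len(children) < pop - len(parents):
                    a, b = random.sample(parents, 2)
                    cut = random.randrange(1, n)     # one-point crossover
                    child = a[:cut] + b[cut:]
                    i = random.randrange(n)          # point mutation
                    child[i] ^= 1
                    children.append(child)
                population = parents + children
            return max(population, key=lambda c: fitness(c, benefit, size, cache_budget))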

    Energy Efficient Virtual Machine Services Placement in Cloud-Fog Architecture

    The proliferation in data volume and processing requests calls for a new breed of on-demand computing. Fog computing has been proposed to address the limitations of cloud computing by extending processing and storage resources to the edge of the network. Cloud and fog computing employ virtual machines (VMs) for efficient resource utilization. To optimize the virtual environment, VMs can be migrated or replicated over geo-distributed physical machines for load balancing and energy efficiency. In this work, we investigate the offloading of VM services from the cloud to the fog, considering the British Telecom (BT) network topology. The analysis addresses the impact of different factors, including the VM workload and the proximity of fog nodes to users, considering the data rates of state-of-the-art applications. The results show that the optimum placement of VMs significantly decreases the total power consumption, by up to 75%, compared to a single cloud placement.
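
    A back-of-the-envelope sketch of the cloud-versus-fog trade-off being optimized: cloud servers tend to be more power-efficient per unit of compute, while fog placement saves network power by keeping high-rate traffic close to users. The power figures, hop counts, and function below are invented for illustration; the paper's actual model is an optimization over the BT network topology.

        def place_vm(workload_cpu, user_traffic_gbps,
                     fog_watts_per_cpu=12.0, cloud_watts_per_cpu=8.0,
                     net_watts_per_gbps_hop=5.0, hops_to_cloud=4, hops_to_fog=1):
            # Total power = server power for the VM's CPU demand
            #             + network power for carrying user traffic over the hops
            #               between users and the hosting site.
            cloud_power = (workload_cpu * cloud_watts_per_cpu
                           + user_traffic_gbps * net_watts_per_gbps_hop * hops_to_cloud)
            fog_power = (workload_cpu * fog_watts_per_cpu
                         + user_traffic_gbps * net_watts_per_gbps_hop * hops_to_fog)
            return ("fog", fog_power) if fog_power < cloud_power else ("cloud", cloud_power)

    Under these assumptions, high-data-rate services near their users favor fog placement, while CPU-heavy, low-traffic VMs stay in the cloud.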