10 research outputs found

    Elastic scaling of data parallel operators in stream processing

    Full text link

    Stela: on-demand elasticity in distributed data stream processing systems

    Get PDF
    Big data is characterized by volume and velocity [24], and recently several real- time stream processing systems have emerged to combat this challenge. These systems process streams of data in real time and computational results. However, current popular data stream processing systems lack the ability to scale out and scale in (i.e., increase or decrease the number of machines or VMs allocated to the application) efficiently and unintrusively when requested by the user on demand. In order to scale out/in, a critical problem that needs to be solved is to determine which operator(s) of the stream processing application need to be given more resources or taken resources away from, in order to maximize the application throughput. We do so by presenting a novel metric called "Expected Throughput Percentage" (ETP). ETP takes into account not only congested elements of the stream processing application but also their effect on downstream elements and on the overall application throughput. Next, we show how our new system, called Stela (STream processing ELAsticity), incorporates ETP in its scheduling strategy. Stela enables scale out and scale in operations on demand, and achieves the twin goals of optimizing post-scaling throughput and minimizing interference to throughput during the scaling out/in. We have integrated the implementation of Stela into Apache Storm [27], a popular data stream processing system. We conducted experiments on Stela using a set of micro benchmark topologies as well as two topologies from Yahoo! Inc. Our experiment results shows Stela achieves 5% to 120% higher post scale throughput comparing to default Storm scheduler performing scale out operations, and 40% to 500% of throughput improvement comparing to the default scheduler during scale in stage. This work is a joint project with Master student Boyang Peng [1]

    Scalable Distributed Service Integrity Attestation for Software-as-a-Service Clouds

    Full text link

    Elasticity and resource aware scheduling in distributed data stream processing systems

    Get PDF
    The era of big data has led to the emergence of new systems for real-time distributed stream processing, e.g., Apache Storm is one of the most popular stream processing systems in industry today. However, Storm, like many other stream processing systems, lacks many important and desired features. One important feature is elasticity with clusters running Storm, i.e. change the cluster size on demand. Since the current Storm scheduler uses a naïve round robin approach in scheduling applications, another important feature is for Storm to have an intelligent scheduler that efficiently uses the underlying hardware by taking into account resource demand and resource availability when performing a scheduling. Both are important features that can make Storm a more robust and efficient system. Even though our target system is Storm, the techniques we have developed can be used in other similar stream processing systems. We have created a system called Stela that we implemented in Storm, which can perform on-demand scale-out and scale-in operations in distributed processing systems. Stela is minimally intrusive and disruptive for running jobs. Stela maximizes performance improvement for scale-out operations and minimally decrease performance for scale-in operations while not changing existing scheduling of jobs. Stela was developed in partnership with another Master’s Student, Le Xu [1]. We have created a system called R-Storm that does intelligent resource aware scheduling within Storm. The default round-robin scheduling mechanism currently deployed in Storm disregards resource demands and availability, and can therefore be very inefficient at times. R-Storm is designed to maximize resource utilization while minimizing network latency. When scheduling tasks, R-Storm can satisfy both soft and hard resource constraints as well as minimizing network distance between components that communicate with each other. The problem of mapping tasks to machines can be reduced to Quadratic Multiple 3-Dimensional Knapsack Problem, which is an NP-hard problem. However, our proposed scheduling algorithm within R-Storm attempts to bypass the limitation associated with NP-hard class of problems. We evaluate the performance of both Stela and R-Storm through our implementations of them in Storm by using several micro-benchmark Storm topologies and Storm topologies in use by Yahoo! In. Our experiments show that compared to Apache Storm’s default scheduler, Stela’s scale-out operation reduces interruption time to as low as 12.5% and achieves throughput that is 45-120% higher than Storm’s. And for scale-in operations, Stela achieves almost zero throughput post scale reduction while two other groups experience 200% and 50% throughput decrease respectively. For R-Storm, we observed that schedulings of topologies done by R-Storm perform on average 50%-100% better than that done by Storm’s default scheduler

    Challenges and experience in prototyping a multi-modal stream analytic and monitoring application on System S

    No full text
    In this paper, we describe the challenges of prototyping a reference application on System S, a distributed stream processing middleware under development at IBM Research. With a large number of stream PEs (Processing Elements) implementing various stream analytic algorithms, running on a large-scale, distributed cluster of nodes, and collaboratively digesting several multi-modal source streams with vastly differing rates, prototyping a reference application on System S faces many challenges. Specifically, we focus on our experience in prototyping DAC (Disaster Assistance Claim monitoring), a reference application dealing with multi-modal stream analytic and monitoring. We describe three critical challenges: (1) How do we generate correlated, multi-modal source streams for DAC? (2) How do we design and implement a comprehensive stream application, like DAC, from many divergent stream analytic PEs? (3) How do we deploy DAC in light of source streams with extremely different rates? We report our experience in addressing these challenges, including modeling a disaster claim processing center to generate correlated source streams, constructing the PE flow graph, utilizing programming supports from System S, adopting parallelism, and exploiting resource-adaptive computation. 1

    Elastic techniques to handle dynamism in real-time data processing systems

    Get PDF
    Real-time data processing is a crucial component of cloud computing today. It is widely adopted to provide an up-to-date view of data for social networks, cloud management, web applications, edge, and IoT infrastructures. Real-time processing frameworks are designed for time-sensitive tasks such as event detection, real-time data analysis, and prediction. Compared to handling offline, batched data, real-time data processing applications tend to be long-running and are prone to performance issues caused by many unpredictable environmental variables, including (but not limited to) job specification, user expectation, and available resources. In order to cope with this challenge, it is crucial for system designers to improve frameworks’ ability to adjust their resource usage to adapt to changing environmental variables, defined as system elasticity. This thesis investigates how elastic resource provisioning helps cloud systems today process real-time data while maintaining predictable performance under workload influence in an automated manner. We explore new algorithms, framework design, and efficient system implementation to achieve this goal. On the other hand, distributed systems today need to continuously handle various application specifications, hardware configurations, and workload characteristics. Maintaining stable performance requires systems to explicitly plan for resource allocation upon starting an application and tailor allocation dynamically during run time. In this thesis, we show how achieving system elasticity can help systems provide tunable performance under the dynamism of many environmental variables without compromising resource efficiency. Specifically, this thesis focuses on the two following aspects: i) Elasticity-aware Scheduling: Real-time data processing systems today are often designed in resource-, workload-agnostic fashion. As a result, most users are unable to perform resource planning before launching an application or adjust resource allocation (both within and across application boundaries) intelligently during the run. The first part of this thesis work (Stela [1], Henge [2], Getafix [3]) explores efficient mechanisms to conduct performance analysis while also enabling elasticity-aware scheduling in today’s cloud frameworks. ii) Resource Efficient Cloud Stack: The second line of work in this thesis aims to improve underlying cloud stacks to support self-adaptive, highly efficient resource provisioning. Today’s cloud systems enforce full isolation that prevents resource sharing among applications at a fine granularity over time. This work (Cameo [4], Dirigo) builds real- time data processing systems for emerging cloud infrastructures with high resource utilization through fine-grained resource sharing. Given that the market for real-time data analysis is expected to increase by the annual rate of 28.2% and reach 35.5 billion by the year 2024 [5], improving system elasticity can introduce a significant reduction to deployment cost and increase in resource utilization. Our works improve the performances of real-time data analytics applications within resource constraints. We highlight some of the improvements as the following: i) Stela explores elastic techniques for single-tenant, on-demand dataflow scale-out and scale-in operations. It improves post-scale throughput by 45-120% during on-demand scale-out and post-scale throughput by 2-5× during on-demand scale-in. ii) Henge develops a mechanism to map application’s performance into a unified scale of resource needs. It reduces resource consumption by 40-60% by maintaining the same level of SLO achievement throughout the cluster. iii) Getafix implements a strategy to analyze workload dynamically and proposes a solution that guides the systems to calculate the number of replicas to generate and the placement plan of these replicas adaptively. It achieves comparable query latency (both average and tail) by achieving 1.45-2.15× memory savings. iv) Cameo proposes a scheduler that supports data-driven, fine-grained operator execution guided by user expectations. It improves cluster utilization by 6× and reduces the performance violation by 72% while compacting more jobs into a shared cluster. v) Dirigo performs fully decentralized, function state-aware, global message scheduling for stateful functions. It is able to reduce tail latency by 60% compared to the local scheduling approach and reduce remote state accesses by 19× compared to the scheduling approach that is unaware of function states. These works can potentially lead to profound cost savings for both cloud providers and end-users

    ストリーム指向プログラムのための並列性抽出アルゴリズムに関する研究

    Get PDF
    筑波大学 (University of Tsukuba)201

    Task Scheduling in Data Stream Processing Systems

    Get PDF
    In the era of big data, with streaming applications such as social media, surveillance monitoring and real-time search generating large volumes of data, efficient Data Stream Processing Systems (DSPSs) have become essential. When designing an efficient DSPS, a number of challenges need to be considered including task allocation, scalability, fault tolerance, QoS, parallelism degree, and state management, among others. In our research, we focus on task allocation as it has a significant impact on performance metrics such as data processing latency and system throughput. An application processed by DSPSs is represented as a Directed Acyclic Graph (DAG), where each vertex represents a task and the edges show the dataflow between the tasks. Task allocation can be defined as the assignment of the vertices in the DAG to the physical compute nodes such that the data movement between the nodes is minimised. Finding an optimal task placement for stream processing systems is NP-hard. Thus, approximate scheduling approaches are required to improve the performance of DSPSs. In this thesis, we present our three proposed schedulers, each having a different heuristic partitioning approach to minimise inter-node communication for either homogeneous or heterogeneous clusters. We demonstrate how each scheduler can efficiently assign groups of highly communicating tasks to compute nodes. Our schedulers are able to outperform two state-of-the-art schedulers for three micro-benchmarks and two real-world applications, increasing throughput and reducing data processing latency as a result of a better task placement
    corecore