
    DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams

    Full text link
    In a data stream management system (DSMS), users register continuous queries and receive result updates as data arrive and expire. We focus on applications with real-time constraints, in which the user must receive each result update within a given period after the update occurs. To handle fast data, the DSMS is commonly placed on top of a cloud infrastructure. Because stream properties such as arrival rates can fluctuate unpredictably, cloud resources must be dynamically provisioned and scheduled accordingly to ensure real-time response. Both existing systems and future developments therefore need to schedule resources dynamically according to the current workload, so that they neither waste resources nor fail to deliver correct results on time. Motivated by this, we propose DRS, a novel dynamic resource scheduler for cloud-based DSMSs. DRS overcomes three fundamental challenges: (a) how to model the relationship between the provisioned resources and query response time; (b) where to best place resources; and (c) how to measure system load with minimal overhead. In particular, DRS includes an accurate performance model based on the theory of \emph{Jackson open queueing networks} and is capable of handling \emph{arbitrary} operator topologies, possibly with loops, splits and joins. Extensive experiments with real data confirm that DRS achieves real-time response with close to optimal resource consumption.
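    The performance model mentioned in the abstract rests on Jackson open queueing network theory. Below is a minimal illustrative sketch of such a model, assuming one M/M/1 station per operator; the arrival rates, service rates, and routing matrix are placeholders for illustration, not parameters from the paper.

```python
# A minimal sketch of a Jackson open-queueing-network performance model,
# assuming one M/M/1 station per operator. All rates and the routing matrix
# below are illustrative placeholders, not values from the DRS paper.
import numpy as np

# External (source) arrival rate into each of 3 operators (tuples/sec).
lam_ext = np.array([100.0, 0.0, 0.0])

# P[i, j] = probability that a tuple leaving operator i is routed to operator j
# (rows may sum to < 1; the remainder leaves the system). Splits, joins and
# loops are all expressible in this matrix.
P = np.array([
    [0.0, 0.6, 0.4],
    [0.0, 0.0, 0.5],
    [0.0, 0.1, 0.0],   # a small feedback loop from operator 3 back to operator 2
])

# Per-operator service rates (tuples/sec) under the current resource allocation.
mu = np.array([150.0, 120.0, 90.0])

# Traffic equations of a Jackson network: lambda = lambda_ext + P^T . lambda
lam = np.linalg.solve(np.eye(3) - P.T, lam_ext)

rho = lam / mu               # per-operator utilisation; must stay below 1 for stability
L = rho / (1.0 - rho)        # expected number of tuples at an M/M/1 station
T = L.sum() / lam_ext.sum()  # Little's law: mean end-to-end sojourn (response) time

print("effective arrival rates:", lam)
print("utilisation:", rho)
print("mean end-to-end response time: %.4f s" % T)
```

    A scheduler built on such a model can, for example, test whether a candidate resource allocation keeps every utilisation below 1 and the predicted response time within the real-time bound before committing to it.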

    A small-scale testbed for large-scale reliable computing

    Get PDF
    High performance computing (HPC) systems frequently suffer errors and failures in hardware components that negatively impact the performance of jobs run on these systems. We analyzed system logs from two HPC systems at Purdue University and created statistical models for memory and hard disk errors. We then built a small-scale error injection testbed, using a customized QEMU build, libvirt, and Python, that lets HPC application programmers test and debug their programs in a faulty environment, so that they can write more robust and resilient programs before deploying them on an actual HPC system. The deliverables for this project are the fault injection program, the modified QEMU source code, and the statistical models used to drive the injection.
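    As a rough illustration of how a statistical error model can drive injection, the sketch below replays memory errors whose inter-arrival times are drawn from a Weibull distribution; the distribution parameters and the inject_memory_error() hook are hypothetical stand-ins, since in the project described above injection goes through a customized QEMU build controlled via libvirt.

```python
# Hedged sketch: driving fault injection from a statistical inter-arrival model.
# The Weibull parameters and inject_memory_error() are hypothetical placeholders.
import random
import time

WEIBULL_SHAPE = 0.8   # shape < 1 means errors tend to cluster (illustrative value)
WEIBULL_SCALE = 6.0   # mean-scale of inter-arrival times in hours (illustrative value)

def next_error_delay_hours():
    """Draw the simulated time (hours) until the next memory error."""
    return random.weibullvariate(WEIBULL_SCALE, WEIBULL_SHAPE)

def inject_memory_error():
    """Placeholder for the real injection hook (e.g. corrupting guest RAM via QEMU)."""
    print("injecting simulated memory error")

def run(duration_hours=24.0, time_scale=3600.0):
    """Replay the error process; with time_scale=3600, one simulated hour lasts one real second."""
    elapsed = 0.0
    while elapsed < duration_hours:
        delay = next_error_delay_hours()
        elapsed += delay
        time.sleep(delay * 3600.0 / time_scale)  # compressed wait before the next error
        inject_memory_error()

if __name__ == "__main__":
    run()
```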

    Scientific Computing Meets Big Data Technology: An Astronomy Use Case

    Full text link
    Scientific analyses commonly compose multiple single-process programs into a dataflow. An end-to-end dataflow of single-process programs is known as a many-task application. Typically, tools from the HPC software stack are used to parallelize these analyses. In this work, we investigate an alternate approach that uses Apache Spark -- a modern big data platform -- to parallelize many-task applications. We present Kira, a flexible and distributed astronomy image processing toolkit built on Apache Spark. We then use the Kira toolkit to implement a Source Extractor application for astronomy images, called Kira SE. With Kira SE as the use case, we study the programming flexibility, dataflow richness, scheduling capacity, and performance of Apache Spark running on the EC2 cloud. By exploiting data locality, Kira SE achieves a 2.5x speedup over an equivalent C program when analyzing a 1 TB dataset using 512 cores on the Amazon EC2 cloud. Furthermore, we show that by leveraging software originally designed for big data infrastructure, Kira SE achieves performance competitive with the C implementation running on the NERSC Edison supercomputer. Our experience with Kira indicates that emerging big data platforms such as Apache Spark are a performant alternative for many-task scientific applications.
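    The many-task pattern described here maps naturally onto Spark's data-parallel API: each input file becomes an independent task. The following PySpark sketch shows the shape of such a pipeline; the S3 path and the extract_sources() routine are illustrative placeholders, not Kira SE's actual code.

```python
# Minimal PySpark sketch of the many-task pattern: one independent task per image file.
# The path and extract_sources() are placeholders, not the Kira SE API.
from pyspark import SparkContext

def extract_sources(name, data):
    """Stand-in for a per-image source-extraction routine (e.g. over a FITS file)."""
    # A real implementation would decode `data` as an image and detect sources;
    # here we only report the payload size as a placeholder result.
    return (name, len(data))

if __name__ == "__main__":
    sc = SparkContext(appName="many-task-sketch")
    # binaryFiles yields (path, bytes) pairs; Spark schedules each file's task
    # near the data, which is how data locality is exploited.
    images = sc.binaryFiles("s3a://example-bucket/astro-images/*.fits")  # placeholder path
    catalog = images.map(lambda kv: extract_sources(kv[0], kv[1]))
    for path, result in catalog.take(5):
        print(path, result)
    sc.stop()
```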

    Integrating a next-generation optical access network testbed into a large-scale virtual research testbed

    Get PDF
    Several experimental research networks have been created in the laboratories of prominent universities and research centres to assess new optical communication technologies. Greater value and research impact can be obtained from these testbeds by making them available to other researchers through research infrastructure federations such as GENI and/or FIRE. This is a challenging task due to the limited programmability of the resource management and virtualisation software in most experimental optical networks. Fed4FIRE is an EU research project that makes it possible to create complex testbed scenarios that interconnect heterogeneous testbeds distributed physically all over the world. In this paper, we present a practical approach for the federation of a next-generation optical access testbed created at Stanford University called UltraFlow Access. That testbed offers its users both packet-switched and circuit-switched services while remaining compatible with conventional PONs. Our approach facilitates experimentation on the UltraFlow Access testbed in the context of large virtual testbeds using Fed4FIRE protocols. The research of this paper was partially financed by the European Union's FP7 grant agreement no. 318389 (Fed4FIRE Project), the National Science Foundation (grant no. 111174), NSERC, and the Spanish projects CRAMnet (grant no. TEC2012-38362-C03-01) and TIGRE5-CM (grant no. S2013/ICE-2919). The authors would also like to acknowledge the support of the Chair of Excellence of Bank of Santander at UC3M.
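    For readers unfamiliar with how such a federation is driven programmatically, the sketch below issues a capability-discovery call against an aggregate manager speaking the GENI AM API, one of the control interfaces used in Fed4FIRE/GENI federations; the endpoint URL and certificate paths are placeholders, not the actual UltraFlow Access configuration.

```python
# Hedged sketch: querying a federated aggregate manager over the GENI AM API.
# The endpoint URL and certificate paths are placeholders.
import ssl
import xmlrpc.client

AM_URL = "https://am.example-testbed.org:12346/am/3.0"  # placeholder endpoint

# Fed4FIRE/GENI federations authenticate experimenters with an X.509 client
# certificate issued by their home authority.
ctx = ssl.create_default_context()
ctx.load_cert_chain(certfile="user-cert.pem", keyfile="user-key.pem")  # placeholder paths

am = xmlrpc.client.ServerProxy(AM_URL, context=ctx)

# GetVersion is the standard capability-discovery call of the AM API; it reports
# the API versions and resource-description (RSpec) formats the testbed supports.
version = am.GetVersion()
print(version)
```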