DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams
In a data stream management system (DSMS), users register continuous queries,
and receive result updates as data arrive and expire. We focus on applications
with real-time constraints, in which the user must receive each result update
within a given period after the update occurs. To handle fast data, the DSMS is
commonly placed on top of a cloud infrastructure. Because stream properties
such as arrival rates can fluctuate unpredictably, cloud resources must be
dynamically provisioned and scheduled accordingly to ensure real-time response.
To avoid wasting resources or failing to deliver correct results on time, it
is essential that existing and future systems be able to schedule resources
dynamically according to the current workload. Motivated by this, we propose
DRS, a novel dynamic
resource scheduler for cloud-based DSMSs. DRS overcomes three fundamental
challenges: (a) how to model the relationship between the provisioned resources
and query response time; (b) where to best place resources; and (c) how to
measure system load with minimal overhead. In particular, DRS includes an
accurate performance model based on the theory of \emph{Jackson open queueing
networks} and is capable of handling \emph{arbitrary} operator topologies,
possibly with loops, splits and joins. Extensive experiments with real data
confirm that DRS achieves real-time response with close to optimal resource
consumption.
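The performance model the abstract cites rests on Jackson's theorem: in an open queueing network, once the traffic equations are solved, each operator behaves like an independent M/M/1 queue, and Little's law gives the end-to-end response time. The sketch below illustrates that calculation for a hypothetical two-operator topology with a feedback loop; the rates, routing matrix, and function name are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch: mean response time of a Jackson open queueing network,
# the class of model the DRS abstract cites. All numbers are illustrative.

def jackson_response_time(gamma, mu, P, iters=1000):
    """gamma: external arrival rate per node; mu: service rate per node;
    P[i][j]: probability a job leaving node i is routed to node j.
    Returns (per-node total arrival rates, mean end-to-end response time)."""
    n = len(gamma)
    lam = gamma[:]
    # solve the traffic equations lam_j = gamma_j + sum_i lam_i * P[i][j]
    # by fixed-point iteration (converges for a stable open network)
    for _ in range(iters):
        lam = [gamma[j] + sum(lam[i] * P[i][j] for i in range(n))
               for j in range(n)]
    assert all(lam[i] < mu[i] for i in range(n)), "network is unstable"
    # Jackson's theorem: each node is an independent M/M/1 queue,
    # so the mean number of jobs at node i is lam_i / (mu_i - lam_i)
    avg_jobs = [lam[i] / (mu[i] - lam[i]) for i in range(n)]
    # Little's law: E[T] = E[N] / total external throughput
    return lam, sum(avg_jobs) / sum(gamma)

# two operators in tandem, with 10% of output fed back to the first
lam, T = jackson_response_time(gamma=[5.0, 0.0], mu=[10.0, 8.0],
                               P=[[0.0, 1.0], [0.1, 0.0]])
```

A scheduler built on such a model can invert it: given a response-time target, search for the cheapest service-rate (resource) assignment keeping `T` under the deadline.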
A small-scale testbed for large-scale reliable computing
High performance computing (HPC) systems frequently suffer errors and failures from hardware components that negatively impact the performance of jobs run on these systems. We analyzed system logs from two HPC systems at Purdue University and created statistical models for memory and hard disk errors. We created a small-scale error injection testbed, built with a customized QEMU build, libvirt, and Python, that lets HPC application programmers test and debug their programs in a faulty environment, so that they can write more robust and resilient programs before deploying them on an actual HPC system. The deliverables for this project are the fault injection program, the modified QEMU source code, and the statistical models used for driving the injection.
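The driver idea described above, sampling error arrival times from a fitted statistical model and injecting faults at those times, can be sketched as follows. The exponential (Poisson-process) model and the rate value are placeholder assumptions; the paper's fitted distributions are not reproduced here.

```python
# Hedged sketch of an injection driver: sample error timestamps from a
# statistical model, then trigger a fault (e.g. via QEMU/libvirt) at each.
# The exponential inter-arrival model and rate are illustrative assumptions.
import random

def error_schedule(rate_per_hour, horizon_hours, seed=42):
    """Return timestamps (in hours) of injected errors over the horizon,
    drawn from a homogeneous Poisson process with the given rate."""
    random.seed(seed)  # fixed seed makes an experiment reproducible
    t, events = 0.0, []
    while True:
        t += random.expovariate(rate_per_hour)  # exponential inter-arrivals
        if t > horizon_hours:
            return events
        events.append(t)

# roughly one memory error every two hours over a 24-hour test run;
# each timestamp would trigger, say, a bit-flip in the guest VM
schedule = error_schedule(rate_per_hour=0.5, horizon_hours=24)
```

Swapping in a heavier-tailed distribution (e.g. Weibull, via `random.weibullvariate`) is a one-line change if the fitted model calls for it.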
Scientific Computing Meets Big Data Technology: An Astronomy Use Case
Scientific analyses commonly compose multiple single-process programs into a
dataflow. An end-to-end dataflow of single-process programs is known as a
many-task application. Typically, tools from the HPC software stack are used to
parallelize these analyses. In this work, we investigate an alternate approach
that uses Apache Spark -- a modern big data platform -- to parallelize
many-task applications. We present Kira, a flexible and distributed astronomy
image processing toolkit using Apache Spark. We then use the Kira toolkit to
implement a Source Extractor application for astronomy images, called Kira SE.
With Kira SE as the use case, we study the programming flexibility, dataflow
richness, scheduling capacity and performance of Apache Spark running on the
EC2 cloud. By exploiting data locality, Kira SE achieves a 2.5x speedup over an
equivalent C program when analyzing a 1TB dataset using 512 cores on the Amazon
EC2 cloud. Furthermore, we show that by leveraging software originally designed
for big data infrastructure, Kira SE achieves competitive performance to the C
implementation running on the NERSC Edison supercomputer. Our experience with
Kira indicates that emerging Big Data platforms such as Apache Spark are a
performant alternative for many-task scientific applications.
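The many-task pattern described above, an independent single-process program applied per input file with results collected at the end, is what Kira expresses through Spark's RDD operations (along the lines of `sc.binaryFiles(...).map(extract).collect()`). The self-contained sketch below uses a standard-library thread pool as a stand-in for the cluster, and a toy `extract_sources` function standing in for a real per-image source extractor; both names are illustrative, not from the Kira codebase.

```python
# Hedged sketch of the many-task pattern: run an independent extraction
# task per image and collect the results, as Kira does with Spark RDDs.
# extract_sources is a hypothetical stand-in for a real source extractor.
from concurrent.futures import ThreadPoolExecutor

def extract_sources(image):
    """Toy per-image task: count 'bright' pixels above a threshold."""
    name, pixels = image
    return name, sum(1 for p in pixels if p > 128)

def run_many_task(images, workers=4):
    """Map the extraction task over all images in parallel, collect results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(extract_sources, images))

images = [("a.fits", [10, 200, 130]), ("b.fits", [255, 3])]
catalog = run_many_task(images)
```

On Spark the same shape gains data locality and fault tolerance for free, which is where the abstract's 2.5x speedup over the equivalent C program comes from.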
Integrating a next-generation optical access network testbed into a large-scale virtual research testbed
Several experimental research networks have been created in the laboratories of prominent universities and research centres to assess new optical communication technologies. A greater value and research impact can be obtained from these testbeds by making them available to other researchers through research infrastructure federations such as GENI and/or FIRE. This is a challenging task due to the limitations of programmability of resource management and virtualisation software in most experimental optical networks. Fed4FIRE is an EU research project that makes it possible to create complex testbed scenarios that interconnect heterogeneous testbeds distributed physically all over the world. In this paper, we present a practical approach for the federation of a next-generation optical access testbed created at Stanford University called UltraFlow Access. That testbed offers its users both packet-switched and circuit-switched services while remaining compatible with conventional PONs. Our approach facilitates experimentation on the UltraFlow Access testbed in the context of large virtual testbeds using Fed4FIRE protocols.