61,267 research outputs found
DiPerF: an automated DIstributed PERformance testing Framework
We present DiPerF, a distributed performance testing framework, aimed at
simplifying and automating service performance evaluation. DiPerF coordinates a
pool of machines that test a target service, collects and aggregates
performance metrics, and generates performance statistics. The aggregate data
collected provide information on service throughput, on service "fairness" when
serving multiple clients concurrently, and on the impact of network latency on
service performance. Furthermore, using this data, it is possible to build
predictive models that estimate a service performance given the service load.
We have tested DiPerF on 100+ machines on two testbeds, Grid3 and PlanetLab,
and explored the performance of job submission services (pre WS GRAM and WS
GRAM) included with Globus Toolkit 3.2.Comment: 8 pages, 8 figures, will appear in IEEE/ACM Grid2004, November 200
Towards Hybrid Cloud-assisted Crowdsourced Live Streaming: Measurement and Analysis
Crowdsourced Live Streaming (CLS), most notably Twitch.tv, has seen explosive
growth in its popularity in the past few years. In such systems, any user can
lively broadcast video content of interest to others, e.g., from a game player
to many online viewers. To fulfill the demands from both massive and
heterogeneous broadcasters and viewers, expensive server clusters have been
deployed to provide video ingesting and transcoding services. Despite the
existence of highly popular channels, a significant portion of the channels is
indeed unpopular. Yet as our measurement shows, these broadcasters are
consuming considerable system resources; in particular, 25% (resp. 30%) of
bandwidth (resp. computation) resources are used by the broadcasters who do not
have any viewers at all. In this paper, we closely examine the challenge of
handling unpopular live-broadcasting channels in CLS systems and present a
comprehensive solution for service partitioning on hybrid cloud. The
trace-driven evaluation shows that our hybrid cloud-assisted design can smartly
assign ingesting and transcoding tasks to the elastic cloud virtual machines,
providing flexible system deployment cost-effectively
Harnessing the Power of Many: Extensible Toolkit for Scalable Ensemble Applications
Many scientific problems require multiple distinct computational tasks to be
executed in order to achieve a desired solution. We introduce the Ensemble
Toolkit (EnTK) to address the challenges of scale, diversity and reliability
they pose. We describe the design and implementation of EnTK, characterize its
performance and integrate it with two distinct exemplar use cases: seismic
inversion and adaptive analog ensembles. We perform nine experiments,
characterizing EnTK overheads, strong and weak scalability, and the performance
of two use case implementations, at scale and on production infrastructures. We
show how EnTK meets the following general requirements: (i) implementing
dedicated abstractions to support the description and execution of ensemble
applications; (ii) support for execution on heterogeneous computing
infrastructures; (iii) efficient scalability up to O(10^4) tasks; and (iv)
fault tolerance. We discuss novel computational capabilities that EnTK enables
and the scientific advantages arising thereof. We propose EnTK as an important
addition to the suite of tools in support of production scientific computing
Measuring and Managing Answer Quality for Online Data-Intensive Services
Online data-intensive services parallelize query execution across distributed
software components. Interactive response time is a priority, so online query
executions return answers without waiting for slow running components to
finish. However, data from these slow components could lead to better answers.
We propose Ubora, an approach to measure the effect of slow running components
on the quality of answers. Ubora randomly samples online queries and executes
them twice. The first execution elides data from slow components and provides
fast online answers; the second execution waits for all components to complete.
Ubora uses memoization to speed up mature executions by replaying network
messages exchanged between components. Our systems-level implementation works
for a wide range of platforms, including Hadoop/Yarn, Apache Lucene, the
EasyRec Recommendation Engine, and the OpenEphyra question answering system.
Ubora computes answer quality much faster than competing approaches that do not
use memoization. With Ubora, we show that answer quality can and should be used
to guide online admission control. Our adaptive controller processed 37% more
queries than a competing controller guided by the rate of timeouts.Comment: Technical Repor
Analytical/ML Mixed Approach for Concurrency Regulation in Software Transactional Memory
In this article we exploit a combination of analytical and Machine Learning (ML) techniques in order to build a performance model allowing to dynamically tune the level of concurrency of applications based on Software Transactional Memory (STM). Our mixed approach has the advantage of reducing the training time of pure machine learning methods, and avoiding approximation errors typically affecting pure analytical approaches. Hence it allows very fast construction of highly reliable performance models, which can be promptly and effectively exploited for optimizing actual application runs. We also present a real implementation of a concurrency regulation architecture, based on the mixed modeling approach, which has been integrated with the open source Tiny STM package, together with experimental data related to runs of applications taken from the STAMP benchmark suite demonstrating the effectiveness of our proposal. © 2014 IEEE
Model-driven Scheduling for Distributed Stream Processing Systems
Distributed Stream Processing frameworks are being commonly used with the
evolution of Internet of Things(IoT). These frameworks are designed to adapt to
the dynamic input message rate by scaling in/out.Apache Storm, originally
developed by Twitter is a widely used stream processing engine while others
includes Flink, Spark streaming. For running the streaming applications
successfully there is need to know the optimal resource requirement, as
over-estimation of resources adds extra cost.So we need some strategy to come
up with the optimal resource requirement for a given streaming application. In
this article, we propose a model-driven approach for scheduling streaming
applications that effectively utilizes a priori knowledge of the applications
to provide predictable scheduling behavior. Specifically, we use application
performance models to offer reliable estimates of the resource allocation
required. Further, this intuition also drives resource mapping, and helps
narrow the estimated and actual dataflow performance and resource utilization.
Together, this model-driven scheduling approach gives a predictable application
performance and resource utilization behavior for executing a given DSPS
application at a target input stream rate on distributed resources.Comment: 54 page
Synapse: Synthetic Application Profiler and Emulator
We introduce Synapse motivated by the needs to estimate and emulate workload
execution characteristics on high-performance and distributed heterogeneous
resources. Synapse has a platform independent application profiler, and the
ability to emulate profiled workloads on a variety of heterogeneous resources.
Synapse is used as a proxy application (or "representative application") for
real workloads, with the added advantage that it can be tuned at arbitrary
levels of granularity in ways that are simply not possible using real
applications. Experiments show that automated profiling using Synapse
represents application characteristics with high fidelity. Emulation using
Synapse can reproduce the application behavior in the original runtime
environment, as well as reproducing properties when used in a different
run-time environments
- …