Predicting Intermediate Storage Performance for Workflow Applications
Configuring a storage system to better serve an application is a challenging
task, complicated by a multidimensional, discrete configuration space and the
high cost of exploring that space (e.g., by running the application under
different storage configurations). To enable selecting the best configuration
in a reasonable time, we design an end-to-end performance prediction mechanism
that estimates the turn-around time of an application using the storage system
under a given configuration. This approach focuses on a generic object-based
storage system design; supports exploring the impact of optimizations targeting
workflow applications (e.g., various data placement schemes) in addition to
other, more traditional configuration knobs (e.g., stripe size or replication
level); and models the system operation at the data-chunk and control-message
level.
This paper presents our experience to date with designing and using this
prediction mechanism. We evaluate the mechanism using micro-benchmarks,
synthetic benchmarks mimicking real workflow applications, and a real
application. A preliminary evaluation shows that we are on track to meet our
objectives: the mechanism can scale to model a workflow application run on an
entire cluster while offering an over 200x speedup factor (normalized by
resource) compared to running the actual application, and can achieve, in the
limited number of scenarios we study, a prediction accuracy that enables
identifying the best storage system configuration.
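A minimal sketch of the kind of configuration search such a predictor enables. The knob names echo the abstract, but the values and the prediction formula are invented placeholders, not the paper's actual chunk-level model:

```python
import itertools

# Hypothetical discrete configuration space for an object-based storage
# system; knob names echo the abstract, values are made up.
CONFIG_SPACE = {
    "stripe_size_kb": [64, 256, 1024],
    "replication_level": [1, 2, 3],
    "data_placement": ["local", "striped", "collocated"],
}

def predict_turnaround_s(config, data_mb=4096.0, net_mb_s=100.0):
    """Toy stand-in for the paper's data-chunk- and control-message-level
    model: transfer time grows with the replicated data volume, and each
    placement scheme applies a hypothetical locality factor.
    (Stripe size is ignored in this toy formula.)"""
    placement_factor = {"local": 0.6, "striped": 1.0, "collocated": 0.8}
    volume = data_mb * config["replication_level"]
    return (volume / net_mb_s) * placement_factor[config["data_placement"]]

def best_configuration():
    # Enumerate the full cross-product of knobs; this is affordable only
    # because each evaluation is a cheap prediction, not an application run.
    keys = list(CONFIG_SPACE)
    candidates = [dict(zip(keys, vals))
                  for vals in itertools.product(*CONFIG_SPACE.values())]
    return min(candidates, key=predict_turnaround_s)

print(best_configuration())
```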
A Factor Framework for Experimental Design for Performance Evaluation of Commercial Cloud Services
Given the diversity of commercial Cloud services, performance evaluation of
candidate services is crucial and beneficial both for service customers
(e.g., cost-benefit analysis) and for providers (e.g., direction of service
improvement). Before implementing an evaluation, the selection of suitable
factors (also called parameters or variables) is a prerequisite for designing
evaluation experiments. However, there seems to be a lack of systematic
approaches to factor selection for Cloud services performance evaluation: in
most of the existing evaluation studies, evaluators chose experimental factors
in an ad hoc and intuitive manner. Based on our previous taxonomy and
modeling work, this paper proposes a factor framework for experimental design
for performance evaluation of commercial Cloud services. This framework
encapsulates the state of the practice of the performance evaluation factors
that people currently take into account in the Cloud Computing domain, and in
turn can help facilitate the design of new experiments for evaluating Cloud services.
Comment: 8 pages, Proceedings of the 4th International Conference on Cloud
Computing Technology and Science (CloudCom 2012), pp. 169-176, Taipei,
Taiwan, December 03-06, 2012
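As an illustration only, once factors have been selected from such a framework they can be turned into a concrete experimental design by taking the cross-product of their levels. The factor names and levels below are hypothetical, not the framework's actual taxonomy:

```python
from itertools import product

# Hypothetical evaluation factors; the paper's framework catalogues the
# factors actually used in published Cloud evaluation studies.
factors = {
    "instance_type": ["small", "large"],
    "region": ["us-east", "eu-west"],
    "workload": ["cpu-bound", "io-bound"],
}

# Full factorial design: one experiment per factor-level combination.
experiments = [dict(zip(factors, levels))
               for levels in product(*factors.values())]
for run_id, exp in enumerate(experiments):
    print(run_id, exp)
```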
Cold Storage Data Archives: More Than Just a Bunch of Tapes
The abundance of sensor and derived data from large scientific experiments,
such as earth observation programs, radio astronomy sky surveys, and
high-energy physics, already exceeds the capacity of the storage hardware
fabricated globally per year. As a result, cold storage data archives are the
often overlooked spearheads of modern big data analytics in scientific,
data-intensive application domains. While high-performance data analytics has
received much attention from the research community, the growing number of
problems in designing and deploying cold storage archives has received very
little attention.
In this paper, we take the first step towards bridging this gap in knowledge
by presenting an analysis of four real-world cold storage archives from three
different application domains. In doing so, we highlight (i) workload
characteristics that differentiate these archives from traditional,
performance-sensitive data analytics, (ii) design trade-offs involved in
building cold storage systems for these archives, and (iii) deployment
trade-offs with respect to migration to the public cloud. Based on our
analysis, we discuss several other important research challenges that need to
be addressed by the data management community.
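One of the deployment trade-offs the analysis refers to, on-premises tape versus public-cloud archival storage, can be framed as a simple break-even cost model. Every number below is an illustrative placeholder, not a figure from the paper:

```python
def archive_cost(capex, opex_per_tb_year, egress_per_tb,
                 tb_stored, tb_read_per_year, years):
    """Total cost of ownership of an archive over its lifetime:
    up-front hardware, ongoing storage cost, and retrieval (egress) cost."""
    storage = capex + opex_per_tb_year * tb_stored * years
    reads = egress_per_tb * tb_read_per_year * years
    return storage + reads

# Placeholder parameters for a hypothetical 10 PB archive over 5 years.
tape = archive_cost(capex=500_000, opex_per_tb_year=2, egress_per_tb=0,
                    tb_stored=10_000, tb_read_per_year=500, years=5)
cloud = archive_cost(capex=0, opex_per_tb_year=12, egress_per_tb=90,
                     tb_stored=10_000, tb_read_per_year=500, years=5)
print(f"tape: ${tape:,.0f}  cloud: ${cloud:,.0f}")
```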
A Massively Scalable Architecture For Instant Messaging & Presence
This paper analyzes the scalability of Instant Messaging & Presence (IM&P) architectures. We take a queueing-based modelling and analysis approach to find the bottlenecks of the current IM&P architecture at the Dutch social network Hyves, as well as of alternative architectures. We use the Hierarchical Evaluation Tool (HIT) to create and analyse models analytically. Based on these results, we recommend a new architecture that provides better scalability than the current one.
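The queueing-based bottleneck analysis can be illustrated with a textbook M/M/1 approximation. This is a simplification for exposition; the paper builds hierarchical models with the HIT tool rather than this closed form, and the server rate below is hypothetical:

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time W = 1 / (mu - lambda) of an M/M/1 queue.
    W diverges as utilization approaches 1, which is how a component
    reveals itself as the bottleneck of the architecture."""
    utilization = arrival_rate / service_rate
    if utilization >= 1.0:
        return float("inf")  # unstable: the server cannot keep up
    return 1.0 / (service_rate - arrival_rate)

# Sweep message arrival rates against a hypothetical presence server
# handling 1000 messages/s to see where response time explodes.
for rate in (200, 600, 900, 990):
    print(f"{rate} msg/s -> {mm1_response_time(rate, 1000.0):.4f} s")
```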
Self-Learning Hot Data Prediction: Where Echo State Network Meets NAND Flash Memories
Understanding the access behavior of hot data well is significant for NAND flash memory, due to its crucial impact on the efficiency of garbage collection (GC) and wear leveling (WL), which respectively dominate the performance and life span of an SSD. Generally, both GC and WL rely greatly on the recognition accuracy of hot data identification (HDI). In this paper, however, we propose for the first time the novel concept of hot data prediction (HDP), under which conventional HDI becomes unnecessary. First, we develop a hybrid optimized echo state network (HOESN), where sufficiently unbiased and continuously shrunk output weights are learnt by a sparse regression based on L2 and L1/2 regularization. Second, quantum-behaved particle swarm optimization (QPSO) is employed to compute reservoir parameters (i.e., global scaling factor, reservoir size, scaling coefficient, and sparsity degree) to further improve prediction accuracy and reliability. Third, in a test on a chaotic benchmark (Rossler), the HOESN performs better than six recent state-of-the-art methods. Finally, simulation results for six typical metrics tested on five real disk workloads, together with on-chip experiment outcomes verified on an actual SSD prototype, indicate that our HOESN-based HDP can reliably improve the access performance and endurance of NAND flash memories.
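A minimal echo state network sketch with a plain ridge (L2) readout, to make the reservoir-plus-readout structure concrete. The paper's HOESN additionally uses L1/2 regularization and QPSO-tuned reservoir parameters, both omitted here, and the sine input is a toy stand-in for a block-hotness trace:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 100

# Random input and reservoir weights; the reservoir's spectral radius is
# scaled below 1 (the "global scaling factor" the paper tunes with QPSO).
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

def run_reservoir(u_seq):
    """Collect reservoir states for an input sequence (simple tanh update)."""
    x = np.zeros(n_res)
    states = []
    for u in u_seq:
        x = np.tanh(W_in @ np.atleast_1d(u) + W @ x)
        states.append(x.copy())
    return np.array(states)

# One-step-ahead prediction on a toy signal.
u = np.sin(np.linspace(0, 20, 400))
X, y = run_reservoir(u[:-1]), u[1:]
ridge = 1e-6  # pure L2 readout; HOESN combines L2 with L1/2 shrinkage
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
print("train MSE:", np.mean((X @ W_out - y) ** 2))
```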
Early Observations on Performance of Google Compute Engine for Scientific Computing
Although Cloud computing emerged for business applications in industry,
public Cloud services have been widely accepted and encouraged for scientific
computing in academia. The recently available Google Compute Engine (GCE) is
claimed to support high-performance and computationally intensive tasks, while
little evaluation studies can be found to reveal GCE's scientific capabilities.
Considering that fundamental performance benchmarking is the strategy of
early-stage evaluation of new Cloud services, we followed the Cloud Evaluation
Experiment Methodology (CEEM) to benchmark GCE and also compare it with Amazon
EC2, to help understand the elementary capability of GCE for dealing with
scientific problems. The experimental results and analyses show both potential
advantages of, and possible threats to applying GCE to scientific computing.
For example, compared to Amazon's EC2 service, GCE may better suit applications
that require frequent disk operations, while it may not be ready yet for single
VM-based parallel computing. Following the same evaluation methodology,
different evaluators can replicate and/or supplement this fundamental
evaluation of GCE. Based on the fundamental evaluation results, suitable GCE
environments can be further established for case studies of solving real
science problems.Comment: Proceedings of the 5th International Conference on Cloud Computing
Technologies and Science (CloudCom 2013), pp. 1-8, Bristol, UK, December 2-5,
201
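A sketch of the kind of fundamental micro-benchmark such an evaluation relies on, here sequential disk write throughput. This illustrates the general approach, not the actual benchmark suite prescribed by CEEM:

```python
import os
import time

def disk_write_throughput(path="bench.tmp", size_mb=256, block_kb=1024):
    """Write size_mb of data in block_kb chunks and report MB/s."""
    block = os.urandom(block_kb * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mb * 1024 // block_kb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # include the device sync, not just page cache
    elapsed = time.perf_counter() - start
    os.remove(path)
    return size_mb / elapsed

print(f"sequential write: {disk_write_throughput():.1f} MB/s")
```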
Flexible resources allocation techniques: characteristics and modelling
At the interface between engineering, economics, social sciences, and humanities, industrial engineering aims to provide answers to business problems in various sectors. One of these problems is the adjustment between the workload required by the work to be carried out and the availability of the company's resources. The objective of this work is to help define a methodology for allocating flexible human resources in the planning and scheduling of industrial activities. The model takes into account two levers of flexibility: one related to working time modulation, and the other to the variety of tasks that a given resource can perform (a multi-skilled actor). On the one hand, multi-skilled actors widen the allocation choices and make it possible to assess the impact of those choices on task durations. On the other hand, working time modulation allows actors' work schedules to vary according to the workload the company has to face.
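A minimal sketch of the two flexibility levers expressed as an allocation procedure. The actors, skills, hour budgets, and tasks are hypothetical, and a real planning model would use exact optimization (e.g., a MILP) rather than this greedy pass:

```python
# Hypothetical actors: a skill set plus a modulated weekly hour budget
# (the working-time-modulation lever sets the per-actor budget).
actors = {
    "alice": {"skills": {"weld", "assemble"}, "hours": 40},
    "bob":   {"skills": {"assemble"},         "hours": 32},
}
tasks = [("assemble", 20), ("weld", 15), ("assemble", 25)]  # (skill, load)

def allocate(actors, tasks):
    """Greedy allocation: give each task to the qualified actor with the
    most remaining hours. Multi-skilling widens the candidate set for
    each task; modulation determines how many hours each actor offers."""
    plan = []
    for skill, load in tasks:
        candidates = [a for a, v in actors.items()
                      if skill in v["skills"] and v["hours"] >= load]
        if not candidates:
            plan.append((skill, load, None))  # unassigned: capacity shortfall
            continue
        best = max(candidates, key=lambda a: actors[a]["hours"])
        actors[best]["hours"] -= load
        plan.append((skill, load, best))
    return plan

print(allocate(actors, tasks))
```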
- …