A review of High Performance Computing foundations for scientists
The increase of existing computational capabilities has made simulation
emerge as a third discipline of Science, lying midway between experimental and
purely theoretical branches [1, 2]. Simulation enables the evaluation of
quantities which otherwise would not be accessible, helps to improve
experiments and provides new insights on systems which are analysed [3-6].
Knowing the fundamentals of computation can be very useful for scientists, for
it can help them to improve the performance of their theoretical models and
simulations. This review includes some technical essentials that can be useful
to this end, and it is devised as a complement for researchers whose education
is focused on scientific issues and not on technological aspects. In this
document we attempt to discuss the fundamentals of High Performance Computing
(HPC) [7] in a way which is easy to understand without much previous
background. We sketch the way standard computers and supercomputers work,
discuss distributed computing, and cover essential aspects to take
into account when running scientific calculations on computers.
Comment: 33 pages
uFLIP: Understanding Flash IO Patterns
Does the advent of flash devices constitute a radical change for secondary
storage? How should database systems adapt to this new form of secondary
storage? Before we can answer these questions, we need to fully understand the
performance characteristics of flash devices. More specifically, we want to
establish what kind of IOs should be favored (or avoided) when designing
algorithms and architectures for flash-based systems. In this paper, we focus
on flash IO patterns, which capture relevant distributions of IOs in time and
space, and our goal is to quantify their performance. We define uFLIP, a
benchmark for measuring the response time of flash IO patterns. We also present
a benchmarking methodology which takes into account the particular
characteristics of flash devices. Finally, we present the results obtained by
measuring eleven flash devices, and derive a set of design hints that should
drive the development of flash-based systems on current devices.
Comment: CIDR 200
A Logical Model and Data Placement Strategies for MEMS Storage Devices
MEMS storage devices are a new form of non-volatile secondary storage with
outstanding advantages over magnetic disks. MEMS storage devices, however,
differ greatly from magnetic disks in structure and access characteristics.
They have thousands of heads called probe tips and provide the following two
major access facilities: (1) flexibility: freely selecting a set of probe tips
for accessing data, (2) parallelism: simultaneously reading and writing data
with the set of probe tips selected. Due to these characteristics, it is
nontrivial to find data placements that fully utilize the capability of MEMS
storage devices. In this paper, we propose a simple logical model called the
Region-Sector (RS) model that abstracts major characteristics affecting data
retrieval performance, such as flexibility and parallelism, from the physical
MEMS storage model. We also suggest heuristic data placement strategies based
on the RS model and derive new data placements for relational data and
two-dimensional spatial data by using those strategies. Experimental results
show that the proposed data placements improve the data retrieval performance
by up to 4.0 times for relational data and by up to 4.8 times for
two-dimensional spatial data of approximately 320 Mbytes compared with those of
existing data placements. Further, these improvements are expected to be more
marked as the database size grows.
Comment: 37 pages
Performance Debugging and Tuning using an Instruction-Set Simulator
Instruction-set simulators allow programmers a detailed level of insight into,
and control over, the execution of a program, including parallel programs and
operating systems. In principle, instruction set simulation can model any
target computer and gather any statistic. Furthermore, such simulators are
usually portable, independent of compiler tools, and deterministic, allowing
bugs to be recreated or measurements repeated. Though often viewed as being
too slow for use as a general programming tool, in the last several years
their performance has improved considerably.
We describe SimICS, an instruction set simulator of SPARC-based
multiprocessors developed at SICS, in its rôle as a general programming tool.
We discuss some of the benefits of using a tool such as SimICS to support
various tasks in software engineering, including debugging, testing, analysis,
and performance tuning. We present in some detail two test cases, where we've
used SimICS to support analysis and performance tuning of two applications,
Penny and EQNTOTT. This work resulted in improved parallelism in, and
understanding of, Penny, as well as a performance improvement for EQNTOTT of
over an order of magnitude. We also present some early work on analyzing
SPARC/Linux, demonstrating the ability of tools like SimICS to analyze
operating systems.
Massively parallel approximate Gaussian process regression
We explore how the big-three computing paradigms -- symmetric multi-processor
(SMP), graphical processing units (GPUs), and cluster computing -- can together
be brought to bear on large-data Gaussian process (GP) regression problems
via a careful implementation of a newly developed local approximation scheme.
Our methodological contribution focuses primarily on GPU computation, as this
requires the most care and also provides the largest performance boost.
However, in our empirical work we study the relative merits of all three
paradigms to determine how best to combine them. The paper concludes with two
case studies. One is a real data fluid-dynamics computer experiment which
benefits from the local nature of our approximation; the second is a synthetic
data example designed to find the largest design for which (accurate) GP
emulation can be performed on a commensurate predictive set in under an hour.
Comment: 24 pages, 6 figures, 1 table