SimGrid: a Sustained Effort for the Versatile Simulation of Large Scale Distributed Systems
In this paper we present SimGrid, a toolkit for the versatile simulation of
large scale distributed systems, whose development effort has been sustained
for the last fifteen years. Over this time period SimGrid has evolved from a
one-laboratory project in the U.S. into a scientific instrument developed by an
international collaboration. The keys to making this evolution possible have
been securing funding, improving the quality of the software, and increasing
the user base. In this paper we describe how we have been able to make advances
on all three fronts, on which we plan to intensify our efforts over the
upcoming years. Comment: 4 pages, submission to WSSSPE'1
ASCR/HEP Exascale Requirements Review Report
This draft report summarizes and details the findings, results, and
recommendations derived from the ASCR/HEP Exascale Requirements Review meeting
held in June, 2015. The main conclusions are as follows. 1) Larger, more
capable computing and data facilities are needed to support HEP science goals
in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of
the demand at the 2025 timescale is at least two orders of magnitude -- and in
some cases greater -- than that available currently. 2) The growth rate of data
produced by simulations is overwhelming the current ability, of both facilities
and researchers, to store and analyze it. Additional resources and new
techniques for data analysis are urgently needed. 3) Data rates and volumes
from HEP experimental facilities are also straining the ability to store and
analyze large and complex data volumes. Appropriately configured
leadership-class facilities can play a transformational role in enabling
scientific discovery from these datasets. 4) A close integration of HPC
simulation and data analysis will aid greatly in interpreting results from HEP
experiments. Such an integration will minimize data movement and facilitate
interdependent workflows. 5) Long-range planning between HEP and ASCR will be
required to meet HEP's research needs. To best use ASCR HPC resources the
experimental HEP program needs a) an established long-term plan for access to
ASCR computational and data resources, b) an ability to map workflows onto HPC
resources, c) the ability for ASCR facilities to accommodate workflows run by
collaborations that can have thousands of individual members, d) to transition
codes to the next-generation HPC platforms that will be available at ASCR
facilities, e) to build up and train a workforce capable of developing and
using simulations and analysis to support HEP scientific research on
next-generation systems. Comment: 77 pages, 13 Figures; draft report, subject to further revisio
Prioritized Data Compression using Wavelets
The volume of data and the velocity with which it is being generated by
computational experiments on high performance computing (HPC) systems are quickly
outpacing our ability to effectively store this information in its full
fidelity. Therefore, it is critically important to identify and study
compression methodologies that retain as much information as possible,
particularly in the most salient regions of the simulation space. In this
paper, we cast this in terms of a general decision-theoretic problem and
discuss a wavelet-based compression strategy for its solution. We provide a
heuristic argument as justification and illustrate our methodology on several
examples. Finally, we discuss how our proposed methodology may be utilized
in an HPC environment on large-scale computational experiments.
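As a rough illustration of what wavelet-based compression can look like, the sketch below applies an orthonormal Haar transform to a 1-D signal and keeps only the largest-magnitude coefficients. This is a generic thresholding scheme, not the prioritized, decision-theoretic method of the paper; the function names `haar_forward`, `haar_inverse`, and `compress`, and the `keep` parameter, are hypothetical and chosen for this example only.

```python
import math


def haar_forward(signal):
    """Orthonormal 1-D Haar transform; len(signal) must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        # Pairwise scaled sums (averages) and differences (details).
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        diff = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = avg + diff
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the signal level by level."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / math.sqrt(2))
            merged.append((a - d) / math.sqrt(2))
        out[:2 * n] = merged
        n *= 2
    return out


def compress(signal, keep):
    """Zero all but the `keep` largest-magnitude Haar coefficients (assumed policy)."""
    coeffs = haar_forward(signal)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


if __name__ == "__main__":
    signal = [2.0, 2.0, 2.0, 2.0, 6.0, 6.0, 6.0, 6.0]
    sparse = compress(signal, keep=2)
    print(haar_inverse(sparse))
```

A scheme that prioritizes salient regions could, for instance, weight each coefficient by the importance of the region it covers before ranking; that weighting is left out here because the abstract does not specify it.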
Computational Strategies for Scalable Genomics Analysis
The revolution in next-generation DNA sequencing technologies is leading to explosive data growth in genomics, posing a significant challenge to the computing infrastructure and software algorithms for genomics analysis. Various big data technologies have been explored to scale up/out current bioinformatics solutions to mine big genomics data. In this review, we survey some of these exciting developments in the applications of parallel distributed computing and special hardware to genomics. We comment on the pros and cons of each strategy in the context of ease of development, robustness, scalability, and efficiency. Although this review is written for an audience from the genomics and bioinformatics fields, it may also be informative for computer scientists with interests in genomics applications.