76,221 research outputs found
The Grid[Way] Job Template Manager, a tool for parameter sweeping
Parameter sweeping is a widely used algorithmic technique in computational
science. It is specially suited for high-throughput computing since the jobs
evaluating the parameter space are loosely coupled or independent.
A tool that integrates the modeling of a parameter study with the control of
jobs in a distributed architecture is presented. The main task is to facilitate
the creation and deletion of job templates, which are the elements describing
the jobs to be run. Extra functionality relies upon the GridWay Metascheduler,
acting as the middleware layer for job submission and control. It supports
interesting features like multi-dimensional sweeping space, wildcarding of
parameters, functional evaluation of ranges, value-skipping and job template
automatic indexation.
The use of this tool increases the reliability of the parameter sweep study
thanks to the systematic bookkeping of job templates and respective job
statuses. Furthermore, it simplifies the porting of the target application to
the grid reducing the required amount of time and effort.Comment: 26 pages, 1 figure
Avida: a software platform for research in computational evolutionary biology
Avida is a software platform for experiments with self-replicating and evolving computer programs. It provides detailed control over experimental settings and protocols, a large array of measurement tools, and sophisticated methods to analyze and post-process experimental data. We explain the general principles on which Avida is built, as well as its main components and their interactions. We also explain how experiments are set up, carried out, and analyzed
Fast filtering and animation of large dynamic networks
Detecting and visualizing what are the most relevant changes in an evolving
network is an open challenge in several domains. We present a fast algorithm
that filters subsets of the strongest nodes and edges representing an evolving
weighted graph and visualize it by either creating a movie, or by streaming it
to an interactive network visualization tool. The algorithm is an approximation
of exponential sliding time-window that scales linearly with the number of
interactions. We compare the algorithm against rectangular and exponential
sliding time-window methods. Our network filtering algorithm: i) captures
persistent trends in the structure of dynamic weighted networks, ii) smoothens
transitions between the snapshots of dynamic network, and iii) uses limited
memory and processor time. The algorithm is publicly available as open-source
software.Comment: 6 figures, 2 table
ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization
ROOT is an object-oriented C++ framework conceived in the high-energy physics
(HEP) community, designed for storing and analyzing petabytes of data in an
efficient way. Any instance of a C++ class can be stored into a ROOT file in a
machine-independent compressed binary format. In ROOT the TTree object
container is optimized for statistical data analysis over very large data sets
by using vertical data storage techniques. These containers can span a large
number of files on local disks, the web, or a number of different shared file
systems. In order to analyze this data, the user can chose out of a wide set of
mathematical and statistical functions, including linear algebra classes,
numerical algorithms such as integration and minimization, and various methods
for performing regression analysis (fitting). In particular, ROOT offers
packages for complex data modeling and fitting, as well as multivariate
classification based on machine learning techniques. A central piece in these
analysis tools are the histogram classes which provide binning of one- and
multi-dimensional data. Results can be saved in high-quality graphical formats
like Postscript and PDF or in bitmap formats like JPG or GIF. The result can
also be stored into ROOT macros that allow a full recreation and rework of the
graphics. Users typically create their analysis macros step by step, making use
of the interactive C++ interpreter CINT, while running over small data samples.
Once the development is finished, they can run these macros at full compiled
speed over large data sets, using on-the-fly compilation, or by creating a
stand-alone batch program. Finally, if processing farms are available, the user
can reduce the execution time of intrinsically parallel tasks - e.g. data
mining in HEP - by using PROOF, which will take care of optimally distributing
the work over the available resources in a transparent way
- …