3,980 research outputs found
AiiDA: Automated Interactive Infrastructure and Database for Computational Science
Computational science has seen in the last decades a spectacular rise in the
scope, breadth, and depth of its efforts. Notwithstanding this prevalence and
impact, it is often still performed using the renaissance model of individual
artisans gathered in a workshop, under the guidance of an established
practitioner. Great benefits could follow instead from adopting concepts and
tools coming from computer science to manage, preserve, and share these
computational efforts. We illustrate here our paradigm sustaining such vision,
based around the four pillars of Automation, Data, Environment, and Sharing. We
then discuss its implementation in the open-source AiiDA platform
(http://www.aiida.net), that has been tuned first to the demands of
computational materials science. AiiDA's design is based on directed acyclic
graphs to track the provenance of data and calculations, and ensure
preservation and searchability. Remote computational resources are managed
transparently, and automation is coupled with data storage to ensure
reproducibility. Last, complex sequences of calculations can be encoded into
scientific workflows. We believe that AiiDA's design and its sharing
capabilities will encourage the creation of social ecosystems to disseminate
codes, data, and scientific workflows.Comment: 30 pages, 7 figure
Feasibility study of an Integrated Program for Aerospace vehicle Design (IPAD). Volume 4: IPAD system design
The computing system design of IPAD is described and the requirements which form the basis for the system design are discussed. The system is presented in terms of a functional design description and technical design specifications. The functional design specifications give the detailed description of the system design using top-down structured programming methodology. Human behavioral characteristics, which specify the system design at the user interface, security considerations, and standards for system design, implementation, and maintenance are also part of the technical design specifications. Detailed specifications of the two most common computing system types in use by the major aerospace companies which could support the IPAD system design are presented. The report of a study to investigate migration of IPAD software between the two candidate 3rd generation host computing systems and from these systems to a 4th generation system is included
Fast Parallel Operations on Search Trees
Using (a,b)-trees as an example, we show how to perform a parallel split with
logarithmic latency and parallel join, bulk updates, intersection, union (or
merge), and (symmetric) set difference with logarithmic latency and with
information theoretically optimal work. We present both asymptotically optimal
solutions and simplified versions that perform well in practice - they are
several times faster than previous implementations
Data structures
We discuss data structures and their methods of analysis. In particular, we treat the unweighted and weighted dictionary problem, self-organizing data structures, persistent data structures, the union-find-split problem, priority queues, the nearest common ancestor problem, the selection and merging problem, and dynamization techniques. The methods of analysis are worst, average and amortized case
Multiple-Edge-Fault-Tolerant Approximate Shortest-Path Trees
Let be an -node and -edge positively real-weighted undirected
graph. For any given integer , we study the problem of designing a
sparse \emph{f-edge-fault-tolerant} (-EFT) {\em -approximate
single-source shortest-path tree} (-ASPT), namely a subgraph of
having as few edges as possible and which, following the failure of a set
of at most edges in , contains paths from a fixed source that are
stretched at most by a factor of . To this respect, we provide an
algorithm that efficiently computes an -EFT -ASPT of size . Our structure improves on a previous related construction designed for
\emph{unweighted} graphs, having the same size but guaranteeing a larger
stretch factor of , plus an additive term of .
Then, we show how to convert our structure into an efficient -EFT
\emph{single-source distance oracle} (SSDO), that can be built in
time, has size , and is able to report,
after the failure of the edge set , in time a
-approximate distance from the source to any node, and a
corresponding approximate path in the same amount of time plus the path's size.
Such an oracle is obtained by handling another fundamental problem, namely that
of updating a \emph{minimum spanning forest} (MSF) of after that a
\emph{batch} of simultaneous edge modifications (i.e., edge insertions,
deletions and weight changes) is performed. For this problem, we build in time a \emph{sensitivity} oracle of size , that
reports in time the (at most ) edges either exiting from
or entering into the MSF. [...]Comment: 16 pages, 4 figure
Task-Driven Dictionary Learning
Modeling data with linear combinations of a few elements from a learned
dictionary has been the focus of much recent research in machine learning,
neuroscience and signal processing. For signals such as natural images that
admit such sparse representations, it is now well established that these models
are well suited to restoration tasks. In this context, learning the dictionary
amounts to solving a large-scale matrix factorization problem, which can be
done efficiently with classical optimization tools. The same approach has also
been used for learning features from data for other purposes, e.g., image
classification, but tuning the dictionary in a supervised way for these tasks
has proven to be more difficult. In this paper, we present a general
formulation for supervised dictionary learning adapted to a wide variety of
tasks, and present an efficient algorithm for solving the corresponding
optimization problem. Experiments on handwritten digit classification, digital
art identification, nonlinear inverse image problems, and compressed sensing
demonstrate that our approach is effective in large-scale settings, and is well
suited to supervised and semi-supervised classification, as well as regression
tasks for data that admit sparse representations.Comment: final draft post-refereein
- …