Partitioning problems in parallel, pipelined and distributed computing
The problem of optimally assigning the modules of a parallel program to the processors of a multiple-computer system is addressed. A Sum-Bottleneck path algorithm is developed that permits the efficient solution of many variants of this problem under some constraints on the structure of the partitions. In particular, the following problems are solved optimally for a single-host, multiple-satellite system: partitioning multiple chain-structured parallel programs, multiple arbitrarily structured serial programs, and single tree-structured parallel programs. In addition, the problems of partitioning chain-structured parallel programs across chain-connected systems and across shared-memory (or shared-bus) systems are also solved under certain constraints. All solutions for parallel programs are equally applicable to pipelined programs. These results extend prior research in this area by explicitly taking concurrency into account and permit the efficient utilization of multiple-computer architectures for a wide range of problems of practical interest.
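To make the flavour of the chain-partitioning problems concrete, the following Python sketch (illustrative only, not the paper's Sum-Bottleneck path algorithm) splits a chain of module weights into k contiguous blocks so that the heaviest block, which bounds pipelined throughput, is as light as possible; the weights are made up.

```python
from functools import lru_cache

def min_bottleneck_partition(weights, k):
    """Split the chain `weights` into k contiguous blocks so that the
    heaviest block (the pipeline bottleneck) is as light as possible."""
    n = len(weights)
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)  # prefix[j] = sum of weights[:j]

    @lru_cache(maxsize=None)
    def best(i, parts):
        # minimal bottleneck for assigning weights[i:] to `parts` processors
        if parts == 1:
            return prefix[n] - prefix[i]
        # try every end point j for the first block weights[i:j]
        return min(
            max(prefix[j] - prefix[i], best(j, parts - 1))
            for j in range(i + 1, n - parts + 2)
        )

    return best(0, k)

# Example: eight chain modules over three processors
print(min_bottleneck_partition([2, 4, 7, 1, 3, 5, 6, 2], 3))  # -> 13
```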
TRIQS/CTHYB: A Continuous-Time Quantum Monte Carlo Hybridization Expansion Solver for Quantum Impurity Problems
We present TRIQS/CTHYB, a state-of-the-art open-source implementation of the
continuous-time hybridisation expansion quantum impurity solver of the TRIQS
package. This code is mainly designed to be used with the TRIQS library in
order to solve the self-consistent quantum impurity problem in a multi-orbital
dynamical mean field theory approach to strongly-correlated electrons, in
particular in the context of realistic calculations. It is implemented in C++
for efficiency and is provided with a high-level Python interface. The code
ships with a new partitioning algorithm that divides the local Hilbert space
without any user knowledge of the symmetries and quantum numbers of the
Hamiltonian. Furthermore, we implement higher-order configuration moves and
show that such moves are necessary to ensure ergodicity of the Monte Carlo in
common Hamiltonians even without symmetry breaking.
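The gist of such an automatic partitioning can be illustrated in a few lines: treat basis states i and j as connected whenever the matrix element H_ij is non-zero; the blocks that the Hamiltonian cannot mix are then the connected components of that graph. This is a hedged sketch of the idea only (the actual TRIQS/CTHYB algorithm additionally tracks the action of creation and annihilation operators); the toy matrix and tolerance are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def partition_hilbert_space(H, tol=1e-12):
    """Group basis states into blocks the Hamiltonian does not mix,
    using only the sparsity pattern of H (no symmetry input needed)."""
    adj = csr_matrix(np.abs(H) > tol)          # state i linked to j if H_ij != 0
    n_blocks, labels = connected_components(adj, directed=False)
    return [np.flatnonzero(labels == b) for b in range(n_blocks)]

# Toy 4-state Hamiltonian with two invariant subspaces {0, 2} and {1, 3}
H = np.array([[1.0, 0.0, 0.5, 0.0],
              [0.0, 2.0, 0.0, 0.3],
              [0.5, 0.0, 1.5, 0.0],
              [0.0, 0.3, 0.0, 2.5]])
print(partition_hilbert_space(H))  # -> [array([0, 2]), array([1, 3])]
```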
Adaptive Processing of Spatial-Keyword Data Over a Distributed Streaming Cluster
The widespread use of GPS-enabled smartphones along with the popularity of
micro-blogging and social networking applications, e.g., Twitter and Facebook,
has resulted in the generation of huge streams of geo-tagged textual data. Many
applications require real-time processing of these streams. For example,
location-based e-coupon and ad-targeting systems enable advertisers to register
millions of ads to millions of users. The number of users is typically very
large, the users move continuously, and the ads change frequently as well.
Hence, sending the right ad to the matching users is very challenging. Existing
streaming systems are either centralized or are not spatial-keyword aware, and
cannot efficiently support the processing of rapidly arriving spatial-keyword
data streams. This paper presents Tornado, a distributed spatial-keyword stream
processing system. Tornado features routing units that fairly distribute the
workload and, furthermore, co-locate the data objects and the corresponding
queries at the same processing units. The routing units use the Augmented-Grid,
a novel structure that is equipped with an efficient search algorithm for
distributing the data objects and queries. Tornado uses evaluators to process
the data objects against the queries. The routing units minimize redundant
communication by not sending data updates for processing when these updates do
not match any query. By applying dynamically evaluated cost formulae that
continuously represent the processing overhead at each evaluator, Tornado is
adaptive to changes in the workload. Extensive experimental evaluation using
spatio-textual range queries over real Twitter data indicates that Tornado
outperforms the non-spatio-textually aware approaches by up to two orders of
magnitude in terms of the overall system throughput.
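The Augmented-Grid structure itself is not detailed in the abstract, but the co-location idea behind the routing units can be sketched with a plain uniform grid: data objects and queries that fall in the same cell are sent to the same evaluator, and an update that matches no registered query is never forwarded. All names and the cell size below are illustrative, not Tornado's API, and queries are simplified to keyword subscriptions at a point rather than spatial ranges.

```python
from collections import defaultdict

CELL = 1.0  # illustrative cell width in degrees

def cell_of(lon, lat):
    """Map a coordinate to a grid cell; objects and queries that share a
    cell are routed to the same evaluator, co-locating them."""
    return (int(lon // CELL), int(lat // CELL))

class RoutingUnit:
    def __init__(self):
        self.queries = defaultdict(list)  # cell -> registered (keywords, qid)

    def register_query(self, qid, lon, lat, keywords):
        self.queries[cell_of(lon, lat)].append((frozenset(keywords), qid))

    def route(self, lon, lat, text):
        """Forward a geo-tagged update only when some query in its cell
        matches it; drop it otherwise to save communication."""
        terms = set(text.lower().split())
        cell = cell_of(lon, lat)
        hits = [qid for kw, qid in self.queries[cell] if kw <= terms]
        return (cell, hits) if hits else None  # None: no evaluator contacted

router = RoutingUnit()
router.register_query("q1", -86.9, 40.4, {"coffee"})
print(router.route(-86.95, 40.42, "free coffee coupon downtown"))
# -> ((-87, 40), ['q1'])
```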
Knowledge revision in systems based on an informed tree search strategy : application to cartographic generalisation
Many real-world problems can be expressed as optimisation problems. Solving
such a problem means finding, among all possible solutions, the one that
maximises an evaluation function. One approach is to use an informed search
strategy, whose principle is to exploit problem-specific knowledge beyond the
definition of the problem itself in order to find solutions more efficiently
than an uninformed strategy would. Such a strategy requires problem-specific
knowledge (heuristics) to be defined, and the efficiency and effectiveness of
systems based on it depend directly on the quality of the knowledge used.
Unfortunately, acquiring and maintaining such knowledge can be tedious. The
objective of the work presented in this paper is to propose an automatic
knowledge revision approach for systems based on an informed tree search
strategy. Our approach analyses the system's execution logs and revises the
knowledge in light of them, modelling the revision problem as a knowledge
space exploration problem. We present an
experiment we carried out in an application domain where informed search
strategies are often used: cartographic generalisation.
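For reference, "informed" here means a search guided by a heuristic. The sketch below is a generic best-first tree search in which the heuristic is a pluggable parameter and an execution log of the kind the revision approach analyses is recorded; it illustrates the setting, not the paper's system, and the toy problem is made up.

```python
import heapq
import itertools

def informed_tree_search(initial, expand, evaluate, heuristic, budget=10_000):
    """Best-first tree search: `heuristic` encodes the problem-specific
    knowledge whose quality the revision approach aims to improve."""
    counter = itertools.count()            # tie-breaker for the heap
    frontier = [(-heuristic(initial), next(counter), initial)]
    best_state, best_score = initial, evaluate(initial)
    log = []                               # execution log, input to revision
    while frontier and budget > 0:
        budget -= 1
        h, _, state = heapq.heappop(frontier)
        log.append((state, -h))            # record state and heuristic value
        score = evaluate(state)
        if score > best_score:
            best_state, best_score = state, score
        for child in expand(state):
            heapq.heappush(frontier, (-heuristic(child), next(counter), child))
    return best_state, log

# Toy use: grow a bit-string to maximise the number of 1s, depth <= 5
best, log = informed_tree_search(
    initial=(),
    expand=lambda s: [s + (b,) for b in (0, 1)] if len(s) < 5 else [],
    evaluate=lambda s: sum(s),
    heuristic=lambda s: sum(s) + (5 - len(s)),   # optimistic guess
)
print(best)  # -> (1, 1, 1, 1, 1)
```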
Multivariate Approaches to Classification in Extragalactic Astronomy
Clustering objects into synthetic groups is a natural activity of any
science. Astrophysics is not an exception and is now facing a deluge of data.
For galaxies, the one-century old Hubble classification and the Hubble tuning
fork are still largely in use, together with numerous mono- or bivariate
classifications most often made by eye. However, a classification must be
driven by the data, and sophisticated multivariate statistical tools are used
more and more often. In this paper we review these different approaches in
order to situate them in the general context of unsupervised and supervised
learning. We emphasise the astrophysical outcomes of these studies to show that
multivariate analyses provide an obvious path toward a renewal of our
classification of galaxies and are invaluable tools to investigate the physics
and evolution of galaxies.
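As a small illustration of a data-driven, multivariate grouping (as opposed to classification by eye), the following sketch clusters a synthetic table of "galaxies" with standard tools; the features and values are entirely made up.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import AgglomerativeClustering

# Illustrative only: rows are galaxies, columns are measured parameters
# (e.g. colour, magnitude, velocity dispersion); values here are random.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(4, 1, (50, 3))])

# Multivariate, data-driven grouping instead of classification by eye
X_std = StandardScaler().fit_transform(X)   # put parameters on one scale
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X_std)
print(np.bincount(labels))  # two recovered groups of ~50 galaxies each
```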
The End of a Myth: Distributed Transactions Can Scale
The common wisdom is that distributed transactions do not scale. But what if
distributed transactions could be made scalable using the next generation of
networks and a redesign of distributed databases? Developers would no longer
need to worry about co-partitioning schemes to achieve decent
performance. Application development would become easier as data placement
would no longer determine how scalable an application is. Hardware
provisioning would be simplified, as the system administrator could expect a
linear scale-out when adding more machines rather than some complex,
highly application-specific sub-linear function.
In this paper, we present the design of our novel scalable database system
NAM-DB and show that distributed transactions with the very common Snapshot
Isolation guarantee can indeed scale using the next generation of RDMA-enabled
network technology without any inherent bottlenecks. Our experiments with the
TPC-C benchmark show that our system scales linearly to over 6.5 million
new-order (14.5 million total) distributed transactions per second on 56
machines.
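NAM-DB's design depends on RDMA and a timestamp oracle, neither of which is modelled here; the sketch below is only a minimal, single-process illustration of the Snapshot Isolation guarantee itself, with first-committer-wins resolution of write-write conflicts. All names are illustrative, not the paper's API.

```python
import itertools

class SnapshotStore:
    """Minimal model of Snapshot Isolation: reads see the version committed
    at or before the transaction's start timestamp, and a commit aborts on
    a write-write conflict with a transaction that committed later."""
    def __init__(self):
        self.clock = itertools.count(1)   # stand-in for the timestamp oracle
        self.versions = {}                # key -> [(commit_ts, value), ...]

    def begin(self):
        return {"start": next(self.clock), "writes": {}}

    def read(self, txn, key):
        # newest version visible at the transaction's snapshot
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= txn["start"]:
                return value
        return None

    def write(self, txn, key, value):
        txn["writes"][key] = value        # buffered until commit

    def commit(self, txn):
        # abort if any written key got a newer version after our snapshot
        for key in txn["writes"]:
            versions = self.versions.get(key, [])
            if versions and versions[-1][0] > txn["start"]:
                return False
        ts = next(self.clock)
        for key, value in txn["writes"].items():
            self.versions.setdefault(key, []).append((ts, value))
        return True

store = SnapshotStore()
t1, t2 = store.begin(), store.begin()
store.write(t1, "x", 1); store.write(t2, "x", 2)
print(store.commit(t1), store.commit(t2))  # True False: first committer wins
```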