GraphLab: A New Framework for Parallel Machine Learning
Designing and implementing efficient, provably correct parallel machine
learning (ML) algorithms is challenging. Existing high-level parallel
abstractions like MapReduce are insufficiently expressive while low-level tools
like MPI and Pthreads leave ML experts repeatedly solving the same design
challenges. By targeting common patterns in ML, we developed GraphLab, which
improves upon abstractions like MapReduce by compactly expressing asynchronous
iterative algorithms with sparse computational dependencies while ensuring data
consistency and achieving a high degree of parallel performance. We demonstrate
the expressiveness of the GraphLab framework by designing and implementing
parallel versions of belief propagation, Gibbs sampling, Co-EM, Lasso and
Compressed Sensing. We show that using GraphLab we can achieve excellent
parallel performance on large-scale real-world problems.
A New MHD Code with Adaptive Mesh Refinement and Parallelization for Astrophysics
A new code, named MAP, is written in Fortran for magnetohydrodynamics (MHD)
calculations with adaptive mesh refinement (AMR) and Message Passing
Interface (MPI) parallelization. Several optional numerical schemes are
available for computing the MHD part, namely the modified MacCormack
scheme (MMC), the Lax-Friedrichs scheme (LF) and the weighted essentially
non-oscillatory (WENO) scheme. All of them are second-order, two-step,
component-wise schemes for hyperbolic conservative equations. Total
variation diminishing (TVD) limiters and approximate Riemann solvers are also
provided. High resolution can be achieved with the hierarchical
block-structured AMR mesh. We use the extended generalized Lagrange multiplier
(EGLM) MHD equations to reduce the divergence error that the scheme produces
in the magnetic induction equation. Numerical algorithms for the
non-ideal terms, e.g., the resistivity and the thermal conduction, are also
included in the MAP code. The details of the AMR and MPI algorithms are
described in the paper.
Comment: 44 pages, 16 figures
Byzantine Approximate Agreement on Graphs
Consider a distributed system with n processors out of which f can be Byzantine faulty. In the approximate agreement task, each processor i receives an input value x_i and has to decide on an output value y_i such that
1) the output values are in the convex hull of the non-faulty processors' input values,
2) the output values are within distance d of each other.
Classically, the values are assumed to be from an m-dimensional Euclidean space, where m >= 1.
In this work, we study the task in a discrete setting, where the input values have some structure expressible as a graph. Namely, the input values are vertices of a finite graph G and the goal is to output vertices that are within distance d of each other in G, but still remain in the graph-induced convex hull of the input values. For d=0, the task reduces to consensus and cannot be solved with a deterministic algorithm in an asynchronous system even with a single crash fault. For any d >= 1, we show that the task is solvable in asynchronous systems when G is chordal and n > (omega+1)f, where omega is the clique number of G. In addition, we give the first Byzantine-tolerant algorithm for a variant of lattice agreement. For synchronous systems, we show tight resilience bounds for the exact variants of these and related tasks over a large class of combinatorial structures.
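The two output conditions can be made concrete for the classical one-dimensional case (m = 1), where the convex hull of the non-faulty inputs is simply the interval between their minimum and maximum. The following is a minimal sketch of a checker under that assumption; the function name and arguments are hypothetical, and it neither covers the graph setting studied in the paper nor implements an agreement protocol.

```python
def is_valid_approximate_agreement(honest_inputs, honest_outputs, d):
    """Check conditions 1) and 2) for real-valued inputs (m = 1).

    Only non-faulty processors are considered: Byzantine processors
    may output anything, so their values are not part of the check.
    """
    lo, hi = min(honest_inputs), max(honest_inputs)
    # 1) every output lies in the convex hull (here: an interval)
    in_hull = all(lo <= y <= hi for y in honest_outputs)
    # 2) all outputs are within distance d of each other
    close = max(honest_outputs) - min(honest_outputs) <= d
    return in_hull and close
```

For example, with non-faulty inputs 0, 4 and 10 and d = 1, the outputs 4.0, 4.5 and 5.0 satisfy both conditions, whereas 4, 6 and 5 violate the distance bound.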
Novel methods for real-time 3D facial recognition
In this paper we discuss our approach to real-time 3D face recognition. We argue the need for real-time operation in a realistic scenario and highlight the pre- and post-processing operations required for effective 3D facial recognition. We focus on selected operations, including face and eye detection, and on fast post-processing operations such as hole filling, mesh smoothing and noise removal. We consider hole-filling strategies such as bilinear interpolation, polynomial interpolation and Laplace, and conclude that bilinear interpolation is preferred. Gaussian and moving average smoothing strategies are compared and it is shown that moving average can have the edge over Gaussian smoothing. The regions around the eyes normally carry a considerable amount of noise; strategies for replacing the eyeball with a spherical surface and for using an elliptical mask in conjunction with hole filling are compared. Results show that the elliptical mask with hole filling works well on face models and is simpler to implement. Finally, performance issues are considered and the system is shown to perform real-time 3D face recognition in just over 1 s 200 ms per face model for a small database.
Hydrologic Terrain Processing Using Parallel Computing
Abstract: Topography, in the form of Digital Elevation Models (DEMs), is widely used to derive information for the modeling of hydrologic processes. Hydrologic terrain analysis augments the information content of digital elevation data by removing spurious pits, deriving a structured flow field, and calculating surfaces of hydrologic information derived from the flow field. The increasing availability of large terrain datasets with very small ground sample distance (GSD) poses a challenge for existing algorithms that process terrain data to extract this hydrologic information. This paper describes a parallel algorithm developed to enhance hydrologic terrain pre-processing so that larger datasets can be computed more efficiently, namely a Message Passing Interface (MPI) parallel implementation of Pit Removal. This key functionality is used within the Terrain Analysis Using Digital Elevation Models (TauDEM) package to remove spurious elevation depressions that are an artifact of the raster representation of the terrain. The parallel algorithm works by decomposing the domain into stripes or tiles, where each tile is processed by a separate processor. This method also reduces the memory requirements of each processor so that larger grids can be processed. The parallel pit removal algorithm is adapted from the method of Planchon and Darboux, which starts from a large elevation and then iteratively scans the grid, lowering each grid cell to the maximum of the original elevation or the lowest neighbor. The MPI implementation reconcile
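The iterative lowering step described in this abstract can be sketched on a single process. The version below is only an illustration under stated assumptions: 8-connected neighbours, boundary cells fixed at their original elevation, and no small gradient added to filled areas. The name `fill_pits` is hypothetical; this is not the TauDEM code and it omits the MPI tiling entirely.

```python
import math


def fill_pits(dem):
    """Serial sketch of pit filling on a rectangular grid (list of rows)."""
    rows, cols = len(dem), len(dem[0])
    # Start from a "large elevation": interior cells are set to infinity,
    # boundary cells keep their original value and act as outlets.
    filled = [
        [dem[r][c] if r in (0, rows - 1) or c in (0, cols - 1) else math.inf
         for c in range(cols)]
        for r in range(rows)
    ]
    neighbours = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                  if (dr, dc) != (0, 0)]
    changed = True
    while changed:  # keep scanning until a full pass changes nothing
        changed = False
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                if filled[r][c] == dem[r][c]:
                    continue  # already at the original elevation
                lowest = min(filled[r + dr][c + dc] for dr, dc in neighbours)
                # Lower the cell to max(original elevation, lowest neighbour).
                candidate = max(dem[r][c], lowest)
                if candidate < filled[r][c]:
                    filled[r][c] = candidate
                    changed = True
    return filled
```

On a small grid with a single interior depression, the depression is raised to the level of the lowest path leading to the boundary, while cells that already drain keep their original elevation.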
Mapping constrained optimization problems to quantum annealing with application to fault diagnosis
Current quantum annealing (QA) hardware suffers from practical limitations
such as finite temperature, sparse connectivity, small qubit numbers, and
control error. We propose new algorithms for mapping Boolean constraint
satisfaction problems (CSPs) onto QA hardware mitigating these limitations. In
particular we develop a new embedding algorithm for mapping a CSP onto a
hardware Ising model with a fixed sparse set of interactions, and propose two
new decomposition algorithms for solving problems too large to map directly
into hardware.
The mapping technique is locally structured, as hardware-compatible Ising
models are generated for each problem constraint, and variables appearing in
different constraints are chained together using ferromagnetic couplings. In
contrast, global embedding techniques generate a hardware-independent Ising
model for all the constraints, and then use a minor-embedding algorithm to
generate a hardware-compatible Ising model. We give an example of a class of
CSPs for which the scaling performance of D-Wave's QA hardware using the local
mapping technique is significantly better than global embedding.
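As a toy illustration of why a ferromagnetic coupling chains two spins together, the sketch below enumerates the ground states of a tiny Ising model by brute force. It assumes the convention E(s) = sum_i h_i s_i + sum_{i<j} J_ij s_i s_j with spins in {-1, +1}, so that a negative J is ferromagnetic. The helper names are hypothetical, and this is neither the paper's embedding algorithm nor a model of annealing hardware.

```python
from itertools import product


def ising_energy(spins, h, J):
    """Energy of a spin configuration; h maps index -> bias, J maps (i, j) -> coupling."""
    linear = sum(bias * spins[i] for i, bias in h.items())
    quadratic = sum(c * spins[i] * spins[j] for (i, j), c in J.items())
    return linear + quadratic


def ground_states(n, h, J):
    """Brute-force all 2**n configurations and return the minimum-energy ones."""
    states = list(product((-1, 1), repeat=n))
    energies = [ising_energy(s, h, J) for s in states]
    best = min(energies)
    return [s for s, e in zip(states, energies) if e == best]


# Two spins standing for the same logical variable, chained with J = -1.
chain = {(0, 1): -1.0}
aligned = ground_states(2, {}, chain)            # both spins agree
biased = ground_states(2, {0: -0.5}, chain)      # a bias on one spin drags the other along
```

With the coupling alone, the two minimum-energy states are the ones in which the spins agree; adding a bias to one spin selects a single state in which both spins take the favoured value.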
We validate the approach by applying D-Wave's hardware to circuit-based
fault-diagnosis. For circuits that embed directly, we find that the hardware is
typically able to find all solutions from a min-fault diagnosis set of size N
using 1000N samples, using an annealing rate that is 25 times faster than a
leading SAT-based sampling method. Further, we apply decomposition algorithms
to find min-cardinality faults for circuits that are up to 5 times larger than
can be solved directly on current hardware.
Comment: 22 pages, 4 figures