1,372 research outputs found

On the implementation of P-RAM algorithms on feasible SIMD computers

Author: Ziani Ridha

The P-RAM model of computation has proved to be a very useful theoretical model for exploiting and extracting inherent parallelism in problems, and thus for designing parallel algorithms. It is therefore important to examine whether results obtained for this model can be translated onto machines considered more realistic in the face of current technological constraints. In this thesis, we show how many techniques and algorithms designed for the P-RAM can be implemented on the feasible SIMD class of computers. The first investigation concerns classes of problems solvable on the P-RAM model using the recursive techniques of compression, tree contraction and 'divide and conquer'. For such problems, specific methods are emphasised to achieve efficient implementations on some SIMD architectures. Problems such as list ranking and polynomial and expression evaluation are shown to have efficient solutions on the 2-dimensional mesh-connected computer. The balanced binary tree technique is widely employed to solve many problems in the P-RAM model. By proposing an implicit embedding of the binary tree of size n on a (√n × √n) mesh-connected computer (in contrast to the usual H-tree approach, which requires a mesh of size ≈ (2√n × 2√n)), we show that many of the problems solvable using this technique can be efficiently implemented on this architecture. Two efficient O(√n) algorithms for solving the bracket matching problem are presented. Consequently, the problems of expression evaluation (where the expression is given in array form), evaluating algebraic expressions with a carrier of constant bounded size, and parsing expressions of both bracket and input-driven languages are all shown to have efficient solutions on the 2-dimensional mesh-connected computer.
Dealing with non-tree-structured computations, we show that the Eulerian tour problem for a given graph with m edges and maximum vertex degree d can be solved in O(d√n) parallel time on the 2-dimensional mesh-connected computer. A way to increase processor utilisation on the 2-dimensional mesh-connected computer is also presented. The suggested method consists of pipelining sets of iteratively solvable problems, each of which uses only a fraction of the available PEs at each step of its execution.
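
List ranking, one of the problems named in this abstract, is commonly solved on the P-RAM by pointer jumping. The sketch below is not taken from the thesis and does not model its mesh-connected implementation; it only simulates the generic synchronous P-RAM rounds sequentially. The function name `list_rank` and the convention that the tail node points to itself are assumptions made for illustration.

```python
def list_rank(succ):
    """Rank every node of a linked list by pointer jumping.

    succ[i] is the successor of node i; the tail points to itself
    (an assumed convention). Returns rank[i], the number of links
    between node i and the tail.
    """
    n = len(succ)
    nxt = list(succ)
    # A node one step from its successor has rank 1; the tail has rank 0.
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    # Each round doubles the distance covered, so about log2(n) rounds suffice.
    rounds = max(1, (n - 1).bit_length())
    for _ in range(rounds):
        # Read only the old arrays, then swap: this mimics one synchronous
        # step in which every processor updates its own node at once.
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
    return rank


# The list 3 -> 0 -> 2 -> 1, where node 1 is the tail.
print(list_rank([2, 1, 1, 0]))
```

On a real P-RAM each list comprehension would be one parallel step over n processors; here they run as ordinary loops.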

Warwick Research Archives Portal Repository

Recent Advances in Graph Partitioning

Author: A Buluç
A Felner
A George
A Lisser
A Pothen
A Trifunović
AB Kahng
AE Feldmann
AH Land
AJ Soper
B Brandfass
B Hendrickson
B Junker
B Monien
B Peng
BW Kernighan
C Aykanat
C Chevalier
C Farhat
C Lanczos
C Walshaw
CE Bichot
CE Ferreira
D Delling
D Drake
D Luxen
D Ron
D Wagner
DA Papa
DE Drake Vinkemeier
E Jeannot
E Rolland
F Comellas
F Glover
F Pellegrini
F Schulz
FT Leighton
G Even
G Karypis
G Zumbusch
H Li
H Meyerhenke
HD Simon
I Moulitsas
I Safro
J Chen
J Cong
J Fietz
J Hromkovič
J Hungershöfer
J Maue
J Shalf
JR Gilbert
K Andreev
K Lang
K Schloegel
KS Camilus
L Brunetta
L Grady
L Lovász
LA Sanchis
LR Ford
M Armbruster
M Bader
M Birn
M Fiedler
M Jerrum
M Newman
M Sellmann
M Zhou
MR Garey
N Sensen
O Goldschmidt
P Chardaire
P Galinier
P Korosec
P Sanders
R Diekmann
R Glantz
R Preis
RD Williams
S Arora
S Huang
S Lafon
S Lloyd
S Pettie
SE Karisch
SY Chan
T Bui
T Kieritz
U Benlic
U Feige
V Osipov
WE Donath
WW Hager
X Sui
Y Low
YM Kim
Ü Çatalyürek
Publication date: 03/02/2015

We survey recent trends in practical algorithms for balanced graph partitioning, together with applications and future research directions.
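
As a rough illustration of what "balanced graph partitioning" asks for, the sketch below evaluates a given partition under one common formulation: minimise the number of cut edges while keeping every block at most (1 + ε)·⌈n/k⌉ vertices. That formulation, and the helper names `edge_cut` and `is_balanced`, are assumptions for illustration; the abstract itself does not state a definition.

```python
from math import ceil


def edge_cut(edges, part):
    """Count edges whose endpoints lie in different blocks."""
    return sum(1 for u, v in edges if part[u] != part[v])


def is_balanced(part, k, eps):
    """Check that no block exceeds (1 + eps) * ceil(n / k) vertices."""
    n = len(part)
    limit = (1 + eps) * ceil(n / k)
    sizes = [0] * k
    for block in part:
        sizes[block] += 1
    return max(sizes) <= limit


# Two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
good = [0, 0, 0, 1, 1, 1]   # one block per triangle
bad = [0, 1, 0, 1, 0, 1]    # alternating assignment
print(edge_cut(edges, good), edge_cut(edges, bad))
print(is_balanced(good, k=2, eps=0.0))
```

Both partitions are perfectly balanced, so only the edge cut separates them; this is the quantity a partitioning heuristic would try to reduce.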

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Exercises in parallel combinatorial computing

Author: Kindervater G.A.P. (Gerard)
Publication date: 15/06/1989

CWI's Institutional Repository

Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

Author: Korenblum Daniel
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2018

Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provably optimal recovery using the algorithm is shown analytically for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed. Comment: 13 figures, 35 references.

arXiv.org e-Print Archive

Directory of Open Access Journals

Parallel data compression

Author: Hirschberg Daniel S.
Stauffer Lynn M.
Publication venue: eScholarship, University of California
Publication date: 01/05/1991

Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested
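
To make "static codeword construction" concrete, here is a minimal sequential sketch of Huffman coding, a classic static method. It is not one of the parallel algorithms the survey covers, and the function name `huffman_code` is an assumption for illustration.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix-free code from symbol frequencies (Huffman's method)."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit codeword.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tiebreak, {symbol: partial codeword}).
    # The unique tiebreak keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every codeword in each.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


text = "aaaabbc"
code = huffman_code(text)
encoded = "".join(code[ch] for ch in text)
print(code, len(encoded))
```

The merge loop is inherently sequential, which is part of why parallel codeword construction is a research topic in its own right.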

eScholarship - University of California

Differential-Algebraic Equations

Publication venue: Zürich : EMS Publ. House
Publication date: 01/01/2006

Differential-algebraic equations (DAEs) are today an independent field of research that is gaining in importance and attracting increasing interest, both for applications and within mathematics itself. This workshop took stock after about 25 years of investigation of DAEs, and future research aims were discussed intensively.

Repositorium für Naturwissenschaften und Technik

Discrete Optimization in Early Vision - Model Tractability Versus Fidelity

Author: Strandmark Petter
Publication venue: Centre for Mathematical Sciences, Lund University
Publication date: 01/01/2012

Early vision is the process occurring before any semantic interpretation of an image takes place. Motion estimation, object segmentation and detection are all parts of early vision, but recognition is not. Some models in early vision are easy to perform inference with: they are tractable. Others describe reality well: they have high fidelity. This thesis improves the tractability-fidelity trade-off of the current state of the art by introducing new discrete methods for image segmentation and other problems of early vision. The first part studies pseudo-boolean optimization, both from a theoretical perspective and from a practical one by introducing new algorithms. The main result is the generalization of the roof duality concept to polynomials of higher degree than two. Another focus is parallelization; discrete optimization methods for multi-core processors, computer clusters, and graphical processing units are presented. Remaining in an image segmentation context, the second part studies parametric problems where a set of model parameters and a segmentation are estimated simultaneously. For a small number of parameters these problems can still be optimally solved. One application is an optimal method for solving the two-phase Mumford-Shah functional. The third part shifts the focus to curvature regularization, where the commonly used length and area penalization is replaced by curvature in two and three dimensions. These problems can be discretized over a mesh and special attention is given to the mesh geometry. Specifically, hexagonal meshes in the plane are compared to square ones and a method for generating adaptive meshes is introduced and evaluated. The framework is then extended to curvature regularization of surfaces. Finally, the thesis is concluded by three applications to early vision problems: cardiac MRI segmentation, image registration, and cell classification.

Lund University Publications

Development of a Navier-Stokes algorithm for parallel-processing supercomputers

Author: Swisshelm Julie M.

An explicit flow solver, applicable to the hierarchy of model equations ranging from Euler to full Navier-Stokes, is combined with several techniques designed to reduce computational expense. The computational domain consists of local grid refinements embedded in a global coarse mesh, where the locations of these refinements are defined by the physics of the flow. Flow characteristics are also used to determine which set of model equations is appropriate for solution in each region, thereby reducing not only the number of grid points at which the solution must be obtained, but also the computational effort required to get that solution. Acceleration to steady-state is achieved by applying multigrid on each of the subgrids, regardless of the particular model equations being solved. Since each of these components is explicit, advantage can readily be taken of the vector- and parallel-processing capabilities of machines such as the Cray X-MP and Cray-2

NASA Technical Reports Server

Some Optimally Adaptive Parallel Graph Algorithms on EREW PRAM Model

Author: Das Sajal K.
Publication venue: University of Central Florida
Publication date: 01/01/1988

The study of graph algorithms is an important area of research in computer science, since graphs offer useful tools to model many real-world situations. The commercial availability of parallel computers has led to the development of efficient parallel graph algorithms. Using an exclusive-read and exclusive-write (EREW) parallel random access machine (PRAM) with a fixed number of processors as the computation model, we design and analyze parallel algorithms for seven undirected graph problems: connected components, spanning forest, fundamental cycle set, bridges, bipartiteness, assignment problems, and approximate vertex coloring. For all but the last two problems, the input data structure is an unordered list of edges, and divide-and-conquer is the paradigm for designing algorithms. One of the algorithms for the assignment problem uses an appropriate variant of the dynamic programming strategy. An elegant data structure, called the adjacency list matrix, used in a vertex-coloring algorithm avoids the sequential nature of linked adjacency lists. Each of the proposed algorithms achieves optimal speedup, choosing an optimal granularity (thus exploiting maximum parallelism) which depends on the density or the number of vertices of the given graph. The processor-(time)² product has been identified as a useful parameter to measure the cost-effectiveness of a parallel algorithm. We derive a lower bound on this measure for each of our algorithms.
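
The cost measures mentioned in this abstract can be computed directly once a sequential and a parallel running time are known. The sketch below uses the standard definitions of speedup and processor-time cost alongside the processor-(time)² product; the running-time formula in `hypothetical_time` is an invented placeholder, not a bound from the thesis, and both function names are assumptions.

```python
from math import log2


def measures(t_seq, t_par, p):
    """Return standard cost measures for a parallel algorithm.

    t_seq: sequential running time, t_par: parallel running time
    on p processors (both in the same arbitrary unit).
    """
    return {
        "speedup": t_seq / t_par,       # how much faster than sequential
        "cost": p * t_par,              # processor-time product
        "pt2": p * t_par ** 2,          # processor-(time)^2 product
    }


def hypothetical_time(n, p):
    """Placeholder parallel time n/p + log2(p); NOT taken from the thesis."""
    return n / p + log2(p)


n = 1024
for p in (4, 32, 256):
    t_par = hypothetical_time(n, p)
    print(p, measures(n, t_par, p))
```

Under this placeholder model, adding processors keeps raising the speedup while the processor-time cost grows, which is the kind of trade-off a granularity choice has to balance.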

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Über die Implementierung der verallgemeinerten Finite-Element-Methode

Author: Schwebke Kai G.
Publication venue: Universität der Bundeswehr München, Fakultät für Bauingenieurwesen und Umweltwissenschaften
Publication date: 01/01/2008

The Generalized Finite Element Method (GFEM) combines desirable features of the standard Finite Element Method (FEM) and the meshless methods. The key difference between the GFEM and the traditional FEM is the construction of the ansatz space. Each node of the finite element mesh carries a number of ansatz functions, expressed in terms of the global coordinate system. These ansatz functions are multiplied by a partition of unity and serve as element ansatz functions in the patch formed by the elements incident at the node. Creating the ansatz space in this way allows for arbitrary ansatz functions, with C0-continuity enforced by construction. The ansatz is enriched using analytical functions or numerical approximations derived from side calculations, which contain a priori knowledge of the solution close to singularities. The performance of the GFEM with higher-order polynomial ansatz functions is compared to traditional h-, p- and hp-extensions of the FEM. Most of the efficient solvers, e.g. multigrid or the CG method, cannot be applied to the semi-definite systems resulting from a GFEM discretization. Several solution strategies are evaluated for higher-order GFEM. The work concludes with a description of the implementation of the GFEM in a flexible object-oriented framework using C++.
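
The construction described in this abstract, nodal functions in global coordinates multiplied by a partition of unity, can be sketched in one dimension. The example assumes a uniform mesh on [0, 1] with piecewise-linear hat functions as the partition of unity; the names `hat` and `gfem_basis` and the choice of enrichment functions are illustrative assumptions, not the thesis's C++ framework.

```python
def hat(i, x, h):
    """Piecewise-linear hat function centred at node i*h with support 2h."""
    return max(0.0, 1.0 - abs(x - i * h) / h)


def gfem_basis(x, n_elems, enrichments):
    """Evaluate all GFEM basis functions at x on a uniform mesh of [0, 1].

    Each node carries every function in `enrichments` (given in the
    global coordinate x), multiplied by that node's hat function.
    """
    h = 1.0 / n_elems
    values = []
    for i in range(n_elems + 1):
        weight = hat(i, x, h)          # partition-of-unity weight of node i
        for f in enrichments:
            values.append(weight * f(x))
    return values


x = 0.3
# The hat functions sum to one everywhere on [0, 1].
print(sum(hat(i, x, 0.25) for i in range(5)))
# Two global ansatz functions per node: the constant 1 and x itself.
vals = gfem_basis(x, 4, [lambda t: 1.0, lambda t: t])
print(len(vals))
```

Because the weights sum to one, summing the basis functions built from a given enrichment reproduces that enrichment exactly, and continuity comes from the hat functions rather than from the enrichment itself.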

Universität der Bundeswehr München: AtheneForschung