Space-Efficient Parallel Algorithms for Combinatorial Search Problems
We present space-efficient parallel strategies for two fundamental
combinatorial search problems, namely, backtrack search and branch-and-bound,
both involving the visit of an n-node tree of height h under the assumption
that a node can be accessed only through its father or its children. For both
problems we propose efficient algorithms that run on a p-processor
distributed-memory machine. For backtrack search, we give a deterministic
algorithm and a Las Vegas algorithm whose running time is optimal, with high
probability. Building on the backtrack search algorithm, we also derive a Las
Vegas algorithm for branch-and-bound with a high-probability bound on its
running time. A remarkable feature of our algorithms is the use of only
constant space per processor, which constitutes a significant improvement upon
previous algorithms, whose space requirements per processor depend on the size
of the (possibly huge) tree to be explored.
Comment: Extended version of the paper in the Proc. of the 38th International
Symposium on Mathematical Foundations of Computer Science (MFCS).
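The constant-space claim is easiest to appreciate in code. Below is a minimal sequential sketch of a stackless depth-first visit that touches a node only through its father or its children and keeps O(1) state (the current node plus a child index); the paper's distributed p-processor algorithms build on this idea but are substantially more involved. The accessors degree, child, parent and the visit callback are hypothetical stand-ins for whatever node interface the application provides.

```python
def backtrack_search(root, degree, child, parent, visit):
    """Stackless depth-first visit of an implicit tree in O(1) space.

    State is just the current node v and the index i of the next child
    to explore; no stack or visited set is kept.
    """
    visit(root)
    v, i = root, 0
    while True:
        if i < degree(v):            # descend into the next unexplored child
            v, i = child(v, i), 0
            visit(v)
        elif v == root:              # all of the root's children are done
            return
        else:                        # climb back to the father and resume
            u = parent(v)
            # recompute which child of u we just finished: this costs time
            # proportional to degree(u) but uses no extra space
            i = next(j for j in range(degree(u)) if child(u, j) == v) + 1
            v = u
```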
Parallel path consistency
Filtering algorithms are well accepted as a means of speeding up the solution of the consistent labeling problem (CLP). Although path consistency does a better job of filtering than arc consistency, arc consistency (AC) remains the preferred technique because it has a much lower time complexity. We are implementing parallel path consistency algorithms on multiprocessors and comparing their performance to the best sequential and parallel arc consistency algorithms. We also intend to characterize the relation between graph structure and algorithm performance. Preliminary work has shown linear speedups for parallelized path consistency, and has also shown that in many cases performance is significantly better than the theoretical worst case. These two results lead us to believe that parallel path consistency may be a superior filtering technique. Finally, we have explored an outer-product computational formulation of path consistency and have obtained excellent results using it on a Connection Machine.
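One compact way to realize the outer-product formulation mentioned above is as a boolean matrix product: with constraints stored as dense d x d boolean matrices, a single path-consistency revision of R[i][j] is the product R[i][k] @ R[k][j]. A minimal sequential PC-1-style sketch under that dense-representation assumption (names are illustrative; the parallel implementations compared in the paper are not reproduced here):

```python
import numpy as np

def path_consistency(R):
    """PC-1-style filtering over dense boolean constraint matrices.

    R[i][j][a, b] is True iff value a for variable i is compatible with
    value b for variable j (R[i][i] is assumed to be the identity).
    Iterates until a full pass changes nothing.
    """
    n = len(R)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    # (a, b) survives only if some value c of variable k
                    # supports both (a, c) and (c, b)
                    support = (R[i][k].astype(int) @ R[k][j].astype(int)) > 0
                    tightened = R[i][j] & support
                    if not np.array_equal(tightened, R[i][j]):
                        R[i][j] = tightened
                        changed = True
    return R
```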
Recommended from our members
An Empirical Study of Dynamic Scheduling on Rings of Processors
The authors empirically analyze and compare two distributed low-overhead policies for scheduling dynamic tree-structured computations on rings of identical PEs. The experiments show that both policies give significant parallel speedup on large classes of computations, and that one yields almost optimal speedup on moderate-size rings. The authors believe that their methodology of experiment design and analysis will prove useful in other such studies.
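The abstract does not spell out the two policies, so the following is only an illustrative toy of the general setting: round-based expansion of a tree-structured computation on a ring, with a simple low-overhead rebalancing rule (an idle PE takes half of its clockwise neighbor's pending nodes). A sequential simulation sketch, with expand() as a hypothetical application hook:

```python
from collections import deque

def simulate_ring(p, root_children, expand):
    """Toy round-based simulation of dynamic tree expansion on a ring
    of p PEs. Returns the number of rounds taken to exhaust the tree."""
    queues = [deque() for _ in range(p)]
    for i, c in enumerate(root_children):
        queues[i % p].append(c)          # seed the ring with the root's children
    rounds = 0
    while any(queues):
        for me in range(p):
            if not queues[me]:           # idle: take half the neighbor's work
                donor = queues[(me + 1) % p]
                for _ in range(len(donor) // 2):
                    queues[me].append(donor.popleft())
            if queues[me]:               # expand one node per round
                queues[me].extend(expand(queues[me].pop()))
        rounds += 1
    return rounds
```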
Parallel processing and expert systems
Whether it be monitoring the thermal subsystem of Space Station Freedom or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not suffice to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are that (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem-solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited.
Three Highly Parallel Computer Architectures and Their Suitability for Three Representative Artificial Intelligence Problems
Virtually all current Artificial Intelligence (AI) applications are designed to run on sequential (von Neumann) computer architectures. As a result, current systems do not scale up. As knowledge is added to these systems, a point is reached where their performance quickly degrades. The performance of a von Neumann machine is limited by the bandwidth between memory and processor (the von Neumann bottleneck). The bottleneck is avoided by distributing the processing power across the memory of the computer. In this scheme the memory becomes the processor (a "smart memory").
This paper highlights the relationship between three representative AI application domains, namely knowledge representation, rule-based expert systems, and vision, and their parallel hardware realizations. Three machines, covering a wide range of fundamental properties of parallel processors, namely module granularity, concurrency control, and communication geometry, are reviewed: the Connection Machine (a fine-grained SIMD hypercube), DADO (a medium-grained MIMD/SIMD/MSIMD tree machine), and the Butterfly (a coarse-grained MIMD butterfly-switch machine).
Performance of arc consistency algorithms on the CRAY
The consistent labeling problem arises in high-level computer vision when assigning semantic meaning to the regions of an image. One drawback of this method is that it is rather slow. By using the consistency tests of node, arc, and path consistency [9], the search space is drastically reduced; for large problems, however, this still takes a fair amount of time. To run these algorithms more efficiently, one can take two approaches. The first is to design special-purpose hardware to run these algorithms. The second is to use faster computers: here again, one can either take advantage of multiprocessors, which are becoming widely available, or use supercomputers like the CRAY, CDC, etc. Here, we present results on the performance of these algorithms on the CRAY supercomputer.
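For readers unfamiliar with the consistency tests, here is a compact sequential sketch of arc consistency in the style of the classic AC-3 algorithm; it is illustrative only, not the vectorized variants benchmarked on the CRAY, and the constraint representation (a predicate per directed arc) is an assumption of the sketch:

```python
from collections import deque

def ac3(domains, constraints):
    """Arc consistency in the style of AC-3.

    domains: dict mapping each variable to a set of values.
    constraints: dict mapping each *directed* arc (x, y) to a predicate
    allowed(vx, vy); both directions of every constraint must be present.
    Filters domains in place; returns False on a domain wipe-out.
    """
    queue = deque(constraints)                     # start with every arc
    while queue:
        x, y = queue.popleft()
        allowed = constraints[(x, y)]
        # keep only the values of x that still have a support in y
        revised = {vx for vx in domains[x]
                   if any(allowed(vx, vy) for vy in domains[y])}
        if revised != domains[x]:
            domains[x] = revised
            if not revised:
                return False                       # no consistent labeling
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True
```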
TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Training
Collective communications are an indispensable part of distributed training.
Running a topology-aware collective algorithm is crucial for optimizing
communication performance by minimizing congestion. Today, such algorithms
exist only for a small set of simple topologies, which limits the topologies
employed in training clusters and makes it hard to handle the irregular
topologies that arise from network failures. In
this paper, we propose TACOS, an automated topology-aware collective
synthesizer for arbitrary input network topologies. TACOS synthesized an
All-Reduce algorithm 3.73x faster than baselines, and synthesized collective
algorithms for a 512-NPU system in just 6.1 minutes.
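As a reference point for what a synthesized collective looks like, here is a hand-derived All-Reduce schedule for the one topology where the answer is classical: a ring. This is not TACOS output or its API, just a sketch of the (src, dst, chunk, phase) step structure such a synthesizer produces:

```python
def ring_allreduce_schedule(ring):
    """Classical ring All-Reduce over the NPUs listed in `ring`.

    The buffer is cut into n chunks; n-1 reduce-scatter steps are
    followed by n-1 all-gather steps, each step a list of simultaneous
    (src, dst, chunk, phase) sends around the ring.
    """
    n = len(ring)
    steps = []
    for s in range(n - 1):                         # reduce-scatter phase
        steps.append([(ring[i], ring[(i + 1) % n], (i - s) % n, "reduce")
                      for i in range(n)])
    for s in range(n - 1):                         # all-gather phase
        steps.append([(ring[i], ring[(i + 1) % n], (i + 1 - s) % n, "gather")
                      for i in range(n)])
    return steps
```

For a 4-NPU ring, ring_allreduce_schedule(["npu0", "npu1", "npu2", "npu3"]) yields the classical 2(n-1) = 6 steps of 4 simultaneous sends each.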
A Parallel Computational Approach for String Matching: A Novel Structure with Omega Model
In recent days, the parallel string matching problem has caught the attention of many researchers because of its importance in different applications like IRS, genome sequencing, data cleaning, etc. While it is easily stated and many simple algorithms perform very well in practice, numerous works have been published on the subject and research is still very active. In this paper we propose an omega parallel computing model for parallel string matching. The algorithm is designed to work on the omega-model parallel architecture, where the text is divided for parallel processing and special searching at the division points is required for consistent and complete searching. This algorithm reduces the number of comparisons, and the parallelization improves the time efficiency. Experimental results show that, on a multi-processor system, the omega-model implementation of the proposed parallel string matching algorithm can reduce string matching time.
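The "special searching at division points" amounts to extending each chunk by len(pattern) - 1 characters so a match that straddles a boundary is found exactly once, by the chunk in which it starts. A minimal sketch of that chunking idea using OS processes in place of the paper's omega-model interconnect (function names are illustrative; run it under a __main__ guard on platforms that spawn processes):

```python
from concurrent.futures import ProcessPoolExecutor

def find_in_chunk(args):
    # Search one chunk, extended by len(pattern) - 1 characters past its
    # end; only matches *starting* inside [lo, hi) are reported, so no
    # match is reported twice across chunks.
    text, pattern, lo, hi = args
    end = min(hi + len(pattern) - 1, len(text))
    hits, i = [], text.find(pattern, lo, end)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1, end)
    return hits

def parallel_match(text, pattern, workers=4):
    # Divide the text into equal chunks, search them in parallel, and
    # merge the sorted match positions.
    step = -(-len(text) // workers)                # ceiling division
    tasks = [(text, pattern, lo, min(lo + step, len(text)))
             for lo in range(0, len(text), step)]
    with ProcessPoolExecutor(max_workers=workers) as ex:
        return sorted(h for hits in ex.map(find_in_chunk, tasks)
                      for h in hits)
```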
Working notes of the 1991 spring symposium on constraint-based reasoning
The Use of Parallel Processing in VLSI Computer-Aided Design Applications
Coordinated Science Laboratory was formerly known as Control Systems Laboratory.
Semiconductor Research Corporation / 87-DP-10