
    Alternating-Direction Line-Relaxation Methods on Multicomputers

    We study the multicomputer performance of a three-dimensional Navier–Stokes solver based on alternating-direction line-relaxation methods. We compare several multicomputer implementations, each of which combines a particular line-relaxation method and a particular distributed block-tridiagonal solver. In our experiments, the problem size was determined by resolution requirements of the application. As a result, the granularity of the computations of our study is finer than is customary in the performance analysis of concurrent block-tridiagonal solvers. Our best results were obtained with a modified half-Gauss–Seidel line-relaxation method implemented by means of a new iterative block-tridiagonal solver that is developed here. Most computations were performed on the Intel Touchstone Delta, but we also used the Intel Paragon XP/S, the Parsytec SC-256, and the Fujitsu S-600 for comparison.
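
    As a point of reference for the tridiagonal solves mentioned in this abstract, the following is a minimal sequential sketch of the classical Thomas algorithm for a scalar tridiagonal system. It is not the distributed, iterative block-tridiagonal solver developed in the paper; the function name `solve_tridiagonal` and the storage convention (`lower[0]` and `upper[-1]` unused) are assumptions made for illustration only.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a scalar tridiagonal system with the Thomas algorithm.

    All four lists have length n. lower[0] and upper[n-1] are ignored.
    Assumes no pivoting is needed (e.g. a diagonally dominant matrix).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Hypothetical 3x3 example: the matrix tridiag(-1, 2, -1).
solution = solve_tridiagonal(
    lower=[0.0, -1.0, -1.0],
    diag=[2.0, 2.0, 2.0],
    upper=[-1.0, -1.0, 0.0],
    rhs=[1.0, 0.0, 1.0],
)
```

    In a line-relaxation sweep, one such system would be solved per grid line. The forward and backward recurrences are inherently sequential along a line, which is one reason distributed variants are studied.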

    Cube&Conquer-inspired Malleable Distributed SAT Solving

    On the van der Waerden numbers w(2;3,t)

    We present results and conjectures on the van der Waerden numbers w(2;3,t) and on the new palindromic van der Waerden numbers pdw(2;3,t). We have computed the new number w(2;3,19) = 349, and we provide lower bounds for 20 <= t <= 39, where for t <= 30 we conjecture these lower bounds to be exact. The lower bounds for 24 <= t <= 30 refute the conjecture that w(2;3,t) <= t^2, and we present an improved conjecture. We also investigate regularities in the good partitions (certificates) to better understand the lower bounds. Motivated by such regularities, we introduce *palindromic van der Waerden numbers* pdw(k; t_0,...,t_{k-1}), defined as ordinary van der Waerden numbers w(k; t_0,...,t_{k-1}), however only allowing palindromic solutions (good partitions), defined as reading the same from both ends. Different from the situation for ordinary van der Waerden numbers, these "numbers" need actually to be pairs of numbers. We compute pdw(2;3,t) for 3 <= t <= 27, and we provide lower bounds, which we conjecture to be exact, for t <= 35. All computations are based on SAT solving, and we discuss the various relations between SAT solving and Ramsey theory. Especially we introduce a novel (open-source) SAT solver, the tawSolver, which performs best on the SAT instances studied here, and which is actually the original DLL-solver, but with an efficient implementation and a modern heuristic typical for look-ahead solvers (applying the theory developed in the SAT handbook article of the second author). Comment: Second version 25 pages, updates of numerical data, improved formulations, and extended discussions on SAT. Third version 42 pages, with SAT solver data (especially for new SAT solver) and improved representation. Fourth version 47 pages, with updates and added explanation.
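
    To make the definition of w(2;3,t) concrete, here is a small backtracking sketch that finds the smallest n for which no good partition of n points exists. It is a plain depth-first search, not the SAT-based approach or the tawSolver described in the abstract, and it is only practical for very small t. The function names `ends_progression`, `extendable`, and `vdw` are hypothetical.

```python
def ends_progression(coloring, color, length):
    """True if the last point ends an arithmetic progression of `length`
    points that all carry `color`."""
    last = len(coloring) - 1
    step = 1
    while last - (length - 1) * step >= 0:
        if all(coloring[last - i * step] == color for i in range(length)):
            return True
        step += 1
    return False


def extendable(coloring, n, lengths):
    """Depth-first search: can `coloring` be extended to a good partition
    of n points, i.e. one where colour i avoids progressions of lengths[i]?"""
    if len(coloring) == n:
        return True
    for color, length in enumerate(lengths):
        coloring.append(color)
        found = (not ends_progression(coloring, color, length)
                 and extendable(coloring, n, lengths))
        coloring.pop()
        if found:
            return True
    return False


def vdw(lengths):
    """Smallest n such that every colouring of n points contains, for some
    colour i, a monochromatic progression of lengths[i] points."""
    n = 1
    while extendable([], n, lengths):
        n += 1
    return n


w_3_3 = vdw((3, 3))
w_3_4 = vdw((3, 4))
```

    The search restarts from scratch for each n, which keeps the sketch short at the cost of repeated work. The values reported in the abstract, such as w(2;3,19) = 349, are far beyond the reach of this naive search.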

    Parallelizing a SAT-Based Product Configurator

    This paper presents how state-of-the-art parallel algorithms designed to solve the Satisfiability (SAT) problem can be applied in the domain of product configuration. During an interactive configuration process, a user selects features step-by-step to find a suitable configuration that fulfills his desires and the set of product constraints. A configuration system can be used to guide the user through the process by validating the selections and providing feedback. Each validation of a user selection is formulated as a SAT problem. Furthermore, an optimization problem is identified to find solutions with the minimum number of changes compared to the previous configuration. An additional constraint is deterministic computation, which is not trivial to achieve in well-performing parallel SAT solvers. In the paper we propose five new deterministic parallel algorithms and experimentally compare them. Experiments show that reasonable speedups are achieved by using multiple threads over the sequential counterpart.

    Parallel processing and expert systems

    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited

    Implementing a tool for designing portable parallel programs

    The implementation aspects of a novel parallel programming model called Cluster-M are presented in this thesis. This model provides an environment for efficiently designing highly parallel portable software. The two main components of this model are Cluster-M Specifications and Cluster-M Representations. A Cluster-M Specification consists of a number of clustering levels emphasizing computation and communication requirements of a parallel solution to a given problem. A Cluster-M Representation, on the other hand, represents a multi-layered partitioning of a system graph corresponding to the topology of the target architecture. A set of basic constructs essential for writing Cluster-M Specifications using PCN is presented. Also, a C program for generating the Cluster-M Representations is shown. Cluster-M Specifications are to be mapped onto the Representations using a proposed mapping methodology. Using Cluster-M, a single piece of software can be ported among various parallel computing systems. This thesis concentrates on the implementation of the Specifications and the Representations.

    Correcting soft errors online in fast fourier transform

    While many algorithm-based fault tolerance (ABFT) schemes have been proposed to detect soft errors offline in the fast Fourier transform (FFT) after computation finishes, none of the existing ABFT schemes detect soft errors online before the computation finishes. This paper presents an online ABFT scheme for FFT so that soft errors can be detected online and the corrupted computation can be terminated in a much more timely manner. We also extend our scheme to tolerate both arithmetic errors and memory errors, develop strategies to reduce its fault tolerance overhead and improve its numerical stability and fault coverage, and finally incorporate it into the widely used FFTW library, one of today's fastest FFT software implementations. Experimental results demonstrate that: (1) the proposed online ABFT scheme introduces much lower overhead than the existing offline ABFT schemes; (2) it detects errors in a much more timely manner; and (3) it also has higher numerical stability and better fault coverage.
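
    The general idea of checksum-based error detection for a Fourier transform can be illustrated with a minimal sketch. It relies on the identity that the sum of all DFT outputs equals N times the first input sample. This is an offline check on a naive DFT chosen for illustration; it is not the online scheme proposed in the paper and does not use FFTW. The names `dft` and `checksum_ok`, and the `corrupt_index` fault-injection parameter, are hypothetical.

```python
import cmath


def dft(x, corrupt_index=None):
    """Naive O(N^2) discrete Fourier transform.

    If corrupt_index is given, one output is perturbed to simulate a
    soft error (illustrative fault injection only).
    """
    n = len(x)
    out = [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
    if corrupt_index is not None:
        out[corrupt_index] += 1.0
    return out


def checksum_ok(x, spectrum, tol=1e-9):
    """Check the identity sum_k X[k] == N * x[0] up to a tolerance."""
    n = len(x)
    expected = n * x[0]
    return abs(sum(spectrum) - expected) <= tol * max(1.0, abs(expected))


signal = [1.0, 2.0, 3.0, 4.0]
clean_passes = checksum_ok(signal, dft(signal))
faulty_passes = checksum_ok(signal, dft(signal, corrupt_index=2))
```

    A single checksum of this kind only reveals that some output is wrong after the whole transform has finished, which is the offline behaviour the abstract contrasts with online detection.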