Solving large scale linear programming
The interior point method (IPM) is now well established as a competitive technique for solving very large scale linear programming problems. The leading variant of the interior point method is the primal-dual predictor-corrector algorithm due to Mehrotra. The main computational step of this algorithm is the repeated formation and solution of a large sparse positive definite system of equations.
We describe an implementation of the predictor-corrector IPM algorithm on MasPar, a massively parallel SIMD computer. At the heart of the implementation is a parallel Cholesky factorization algorithm for sparse matrices. Our implementation uses a new scheme for mapping the matrix onto the processor grid of the MasPar, which results in a more efficient Cholesky factorization than previously suggested schemes.
The IPM implementation uses the parallel unit of the MasPar to speed up the factorization and other computationally intensive parts of the IPM. An important part of this implementation is the judicious division of data and computation between the front-end computer, which runs the main IPM algorithm, and the parallel unit.
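The computational kernel described above, forming and solving a sparse positive definite system by Cholesky factorization, can be sketched as follows. This is a minimal dense, serial illustration on a made-up toy system (the matrices `A`, `D`, and `r` are invented examples), not the MasPar implementation:

```python
import numpy as np

def cholesky(A):
    """Right-looking Cholesky factorization: A = L @ L.T for SPD A.

    Each column step updates a trailing submatrix; that update is the
    bulk of the work a parallel implementation distributes.
    """
    A = np.array(A, dtype=float)
    n = A.shape[0]
    L = np.zeros_like(A)
    for j in range(n):
        L[j, j] = np.sqrt(A[j, j] - L[j, :j] @ L[j, :j])
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - L[i, :j] @ L[j, :j]) / L[j, j]
    return L

# Solve normal equations M x = r with M = A D A^T (SPD), as in IPMs:
A = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]])
D = np.diag([1.0, 0.5, 2.0])
M = A @ D @ A.T
L = cholesky(M)
r = np.array([1.0, 2.0])
y = np.linalg.solve(L, r)    # solve L y = r
x = np.linalg.solve(L.T, y)  # solve L^T x = y
```

In a real IPM the diagonal `D` changes every iteration while the sparsity pattern of `M` does not, which is why the factorization is recomputed repeatedly over a fixed structure.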
A Hybrid Multi-GPU Implementation of Simplex Algorithm with CPU Collaboration
The simplex algorithm has been successfully used for many years in solving
linear programming (LP) problems. Due to the intensive computations required
(especially for the solution of large LP problems), parallel approaches have
also been studied extensively. The computational power of modern GPUs,
together with the rapid development of multicore CPU systems, has made
OpenMP and CUDA among the most popular programming models in recent years.
However, efficient collaboration between CPU and GPU through the combined
use of these two programming models is still considered a hard research
problem. In this context, we demonstrate a highly efficient implementation
of the standard simplex method that targets the best possible exploitation
of all available computing resources on a multicore platform with multiple
CUDA-enabled GPUs. More concretely, we present a novel hybrid collaboration
scheme based on the concurrent execution of suitably distributed
CPU-assigned (via multithreading) and GPU-offloaded computations. The
experimental results obtained through the cooperative use of OpenMP and
CUDA on a notably powerful modern hybrid platform (32 cores and two
high-end GPUs, a Titan RTX and an RTX 2080 Ti) show that the performance of
the presented hybrid GPU/CPU collaboration scheme is clearly superior to a
GPU-only implementation under almost all conditions. The corresponding
measurements validate the value of using all resources concurrently, even
on a multi-GPU platform. Furthermore, the given implementations are fully
comparable to (and in most cases slightly better than) other related
attempts in the literature, and clearly superior to a CPU-only
implementation with 32 cores.
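The tableau-pivot kernel that such GPU/CPU simplex implementations parallelize can be sketched as follows. This is a minimal serial illustration on a made-up two-constraint LP, not the paper's code:

```python
import numpy as np

def simplex_pivot(T):
    """One pivot of the standard simplex tableau.

    T: tableau with the objective row last and the RHS in the final
    column. Returns False if the tableau is already optimal. The row
    update loop is embarrassingly parallel across rows, which is what
    GPU and multicore implementations exploit.
    """
    obj = T[-1, :-1]
    if np.all(obj >= 0):
        return False                      # optimal: no negative reduced cost
    col = int(np.argmin(obj))             # entering variable (most negative)
    rows = T[:-1, col]
    with np.errstate(divide="ignore", invalid="ignore"):
        ratios = np.where(rows > 0, T[:-1, -1] / rows, np.inf)
    row = int(np.argmin(ratios))          # leaving variable (min-ratio test)
    T[row] /= T[row, col]                 # normalize the pivot row
    for i in range(T.shape[0]):           # eliminate pivot column elsewhere
        if i != row:
            T[i] -= T[i, col] * T[row]
    return True

# maximize x + y subject to x + y <= 4, x <= 2 (slack columns s1, s2):
T = np.array([[ 1.0,  1.0, 1.0, 0.0, 4.0],
              [ 1.0,  0.0, 0.0, 1.0, 2.0],
              [-1.0, -1.0, 0.0, 0.0, 0.0]])
while simplex_pivot(T):
    pass
# optimal objective value accumulates in T[-1, -1]
```

In the hybrid setting, rows of the tableau (or of the dense update in a revised variant) would be partitioned between CPU threads and GPU kernels; the serial loop above marks exactly the step being split.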
Advances in design and implementation of optimization software
Developing optimization software that is capable of solving large and complex real-life problems is a huge effort. It is based on a deep knowledge of four areas: the theory of optimization algorithms, relevant results of computer science, principles of software engineering, and computer technology. The paper highlights the diverse requirements of optimization software and introduces the ingredients needed to fulfill them. After a review of the hardware/software environment, it gives a survey of computationally successful techniques for continuous optimization. It also outlines the perspective offered by parallel computing, and stresses the importance of optimization modeling systems. The inclusion of many references is intended both to give due credit to results in the field of optimization software and to help readers obtain more detailed information on issues of interest.
Modern Optimization Algorithms and Applications: Architectural Layout Generation and Parallel Linear Programming
This thesis examines two topics from the field of computational optimization: architectural layout generation and parallel linear programming. The first topic, a modern problem in heuristic optimization, focuses on deriving a general form of the optimization problem and solving it with the proposed Evolutionary Treemap algorithm. Tests of the algorithm's implementation within a highly scalable web application, developed with Scala and the web service framework Play, reveal that the algorithm is effective at generating layouts in multiple styles. The second topic, a classical problem in operations research, focuses on methodologies for implementing the simplex algorithm on a parallel computer for solving large-scale linear programming problems. Implementations of the algorithm's data-parallel and task-parallel forms illuminate the ideal method for accelerating a solver. The proposed Multi-Path Simplex Algorithm shows an average speedup of over two times that of a popular open-source solver, showing it is an effective methodology for solving linear programming problems.
A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications
Particle swarm optimization (PSO) is a heuristic global optimization method, proposed originally by Kennedy and Eberhart in 1995. It is now one of the most commonly used optimization techniques. This survey presents a comprehensive investigation of PSO. On one hand, we review advances in PSO, including its modifications (quantum-behaved PSO, bare-bones PSO, chaotic PSO, and fuzzy PSO), population topologies (fully connected, von Neumann, ring, star, random, etc.), hybridizations (with genetic algorithms, simulated annealing, tabu search, artificial immune systems, ant colony algorithms, artificial bee colony, differential evolution, harmony search, and biogeography-based optimization), extensions (to multiobjective, constrained, discrete, and binary optimization), theoretical analysis (parameter selection and tuning, and convergence analysis), and parallel implementations (in multicore, multiprocessor, GPU, and cloud computing forms). On the other hand, we offer a survey of applications of PSO in the following fields: electrical and electronic engineering, automation control systems, communication theory, operations research, mechanical engineering, fuel and energy, medicine, chemistry, and biology. It is hoped that this survey will be beneficial for researchers studying PSO algorithms.
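A minimal global-best PSO in the original Kennedy-Eberhart spirit, with the common inertia-weight modification, might look like the sketch below. All parameter values are illustrative defaults, not prescriptions from the survey, and the sphere function is a made-up test objective:

```python
import random

def pso(f, dim, n_particles=30, iters=200, lo=-5.0, hi=5.0,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO for minimizing f over [lo, hi]^dim."""
    rng = random.Random(seed)
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                       # personal best positions
    pbest = [f(x) for x in X]                   # personal best values
    g = min(range(n_particles), key=lambda i: pbest[i])
    G, gbest = P[g][:], pbest[g]                # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # velocity = inertia + cognitive pull + social pull
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (P[i][d] - X[i][d])
                           + c2 * r2 * (G[d] - X[i][d]))
                X[i][d] += V[i][d]
            fx = f(X[i])
            if fx < pbest[i]:
                P[i], pbest[i] = X[i][:], fx
                if fx < gbest:
                    G, gbest = X[i][:], fx
    return G, gbest

# sanity check on the sphere function, whose minimum is 0 at the origin:
best, val = pso(lambda x: sum(v * v for v in x), dim=2)
```

The per-particle update loop is independent across particles except for the shared global best, which is precisely what the multicore, GPU, and cloud parallelizations surveyed above partition and synchronize.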
A sparse-grid isogeometric solver
Isogeometric Analysis (IGA) typically adopts tensor-product splines and NURBS
as a basis for the approximation of the solution of PDEs. In this work, we
investigate to which extent IGA solvers can benefit from the so-called
sparse-grids construction in its combination technique form, which was first
introduced in the early 90s in the context of the approximation of
high-dimensional PDEs. The tests that we report show that, in accordance
with the literature, a sparse-grid construction can indeed be useful if the
solution of the PDE at hand is sufficiently smooth. Sparse grids can also
be useful in the case of non-smooth solutions when some a priori knowledge
of the location of the singularities of the solution can be exploited to
devise suitable non-equispaced meshes. Finally, we remark that sparse grids
can be seen as a simple way to parallelize pre-existing serial IGA solvers
in a straightforward fashion, which can be beneficial in many practical
situations.
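For reference, the classical two-dimensional form of the combination technique mentioned above writes the sparse-grid approximation as a signed sum of tensor-product solutions on anisotropic grids (the notation here is the generic textbook one, not necessarily this paper's):

```latex
u^{c}_{n} \;=\; \sum_{\ell_1 + \ell_2 = n} u_{\ell_1,\ell_2}
\;-\; \sum_{\ell_1 + \ell_2 = n - 1} u_{\ell_1,\ell_2}
```

Each component solution $u_{\ell_1,\ell_2}$ lives on an independent anisotropic grid of level $(\ell_1,\ell_2)$, which is why a pre-existing serial solver can be reused unchanged on each component and the components computed in parallel.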
A multiprocessor computer simulation model employing a feedback scheduler/allocator for memory space and bandwidth matching and TMR processing
A computer simulation model for a multiprocessor computer is developed that is useful for studying the problem of matching a multiprocessor's memory space, memory bandwidth, and numbers and speeds of processors with aggregate job set characteristics. The model assumes an input workload consisting of a set of recurrent jobs. The model includes a feedback scheduler/allocator which attempts to improve system performance, through higher memory bandwidth utilization, by matching individual job requirements for space and bandwidth with space availability and estimates of bandwidth availability at the times of memory allocation. The simulation model includes provisions for specifying precedence relations among the jobs in a job set, and provisions for specifying the execution of TMR (Triple Modular Redundant) and SIMPLEX (non-redundant) jobs.
Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing
Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the École Normale Supérieure de Lyon, France, on 21-23 July 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks, and eight poster presentations. All three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program ''Investissements d'Avenir'' ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM.