
    Clock Math — a System for Solving SLEs Exactly

    In this paper, we present a GPU-accelerated hybrid system that solves ill-conditioned systems of linear equations (SLEs) exactly. Exactly means without rounding errors, which is achieved by using integer arithmetic. First, we scale the floating-point numbers up to integers; then we solve dozens of instances of the SLE in different modular arithmetics; finally, we assemble the sub-solutions back using the Chinese remainder theorem. This approach effectively bypasses the limitations of CPU floating-point arithmetic. The system is capable of solving systems with a Hilbert matrix without losing a single bit of precision, and with a significant speedup compared to existing CPU solvers.
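
    A minimal sketch of the residue-arithmetic idea (not the paper's GPU implementation): a 2x2 integer system is solved via Cramer's rule modulo several primes, and the Chinese remainder theorem reassembles the exact result. The matrix, the primes, and all helper names are illustrative only.

        from fractions import Fraction

        A = [[3, 1], [1, 2]]            # small integer system A x = b
        b = [9, 8]
        primes = [10007, 10009, 10037]  # pairwise coprime moduli

        def det2(m, p):                 # 2x2 determinant mod p
            return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % p

        # Cramer's rule residues: determinant and numerators mod each prime
        dets  = [det2(A, p) for p in primes]
        nums0 = [det2([[b[0], A[0][1]], [b[1], A[1][1]]], p) for p in primes]
        nums1 = [det2([[A[0][0], b[0]], [A[1][0], b[1]]], p) for p in primes]

        def crt(vals, mods):            # Garner-style Chinese remainder combination
            x, m = 0, 1
            for v, p in zip(vals, mods):
                x += m * (((v - x) * pow(m, -1, p)) % p)
                m *= p
            return x, m                 # x in [0, m)

        def signed(x, m):               # map to (-m/2, m/2] so negatives survive
            return x - m if x > m // 2 else x

        d, M = crt(dets, primes)
        x0 = Fraction(signed(crt(nums0, primes)[0], M), signed(d, M))
        x1 = Fraction(signed(crt(nums1, primes)[0], M), signed(d, M))
        print(x0, x1)                   # exact rational solution: 2 and 3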

    Parallel Solver of Large Systems of Linear Inequalities Using Fourier-Motzkin Elimination

    Fourier-Motzkin elimination is a computationally expensive but powerful method for solving a system of linear inequalities. Such systems arise, e.g., in execution-order analysis for loop nests or in integer linear programming. This paper focuses on the analysis, design, and implementation of a distributed-memory parallel solver for large systems of linear inequalities based on the Fourier-Motzkin elimination algorithm. We also measure the speedup of the parallel solver and show that the implementation scales well.
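
    A minimal sketch of a single, sequential Fourier-Motzkin elimination step, assuming inequalities of the form coeffs . x <= rhs; the distributed solver parallelizes exactly this pairwise combination, whose output can grow quadratically per eliminated variable. All names and the example system are illustrative.

        from fractions import Fraction

        def eliminate(ineqs, k):
            """One FM step: remove variable k from inequalities (coeffs, rhs)."""
            pos, neg, zero = [], [], []
            for a, c in ineqs:
                (pos if a[k] > 0 else neg if a[k] < 0 else zero).append((a, c))
            out = list(zero)
            # every (upper bound, lower bound) pair yields one new inequality
            for ap, cp in pos:
                for an, cn in neg:
                    s, t = Fraction(1, ap[k]), Fraction(-1, an[k])
                    a = [s * x + t * y for x, y in zip(ap, an)]
                    out.append((a, s * cp + t * cn))
            return out

        # x + y <= 4, -x + y <= 2, -y <= 0; eliminating x (k = 0) leaves
        # 2y <= 6 and -y <= 0, i.e. 0 <= y <= 3.
        system = [([1, 1], 4), ([-1, 1], 2), ([0, -1], 0)]
        print(eliminate(system, 0))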

    Block Iterators for Sparse Matrices


    A New Format for the Sparse Matrix-vector Multiplication

    Algorithms for sparse matrix-vector multiplication (SpMV for short) are important building blocks in solvers of sparse systems of linear equations. Due to matrix sparsity, the memory access patterns are irregular and cache utilization suffers from low spatial and temporal locality. To reduce this effect, register blocking formats were designed. This paper introduces a new combined format for storing sparse matrices that extends the possibilities of the diagonal register blocking format.
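
    For context, a minimal CSR SpMV kernel, the baseline that register blocking formats aim to improve; the irregular accesses to x are the locality problem described above. This is a generic illustration, not code from the paper.

        def spmv_csr(row_ptr, col_idx, values, x):
            y = [0.0] * (len(row_ptr) - 1)
            for i in range(len(y)):
                for k in range(row_ptr[i], row_ptr[i + 1]):
                    y[i] += values[k] * x[col_idx[k]]   # irregular access to x
            return y

        # the 3x3 matrix [[1,0,2],[0,3,0],[4,0,5]] in CSR
        row_ptr = [0, 2, 3, 5]
        col_idx = [0, 2, 1, 0, 2]
        values = [1.0, 2.0, 3.0, 4.0, 5.0]
        print(spmv_csr(row_ptr, col_idx, values, [1.0, 1.0, 1.0]))  # [3.0, 3.0, 9.0]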

    A New Approach for Accelerating the Sparse Matrix-vector Multiplication

    Sparse matrix-vector multiplication (SpMV for short) is one of the most common subroutines in numerical linear algebra. The problem is that the memory access patterns during SpMV are irregular, and cache utilization can suffer from low spatial or temporal locality. Approaches to improving the performance of SpMV are based on matrix reordering and register blocking. These matrix transformations are designed to handle randomly occurring dense blocks in a sparse matrix, and their efficiency depends strongly on the presence of suitable blocks. The overhead of reorganizing a matrix from one format to another is often on the order of tens of SpMV executions; for that reason, such a reorganization pays off only if the same matrix A is multiplied by multiple different vectors, e.g., in iterative linear solvers. This paper introduces a new approach to accelerating SpMV. It consists of three tightly coupled steps: 1) dividing the matrix A into non-empty regions, 2) choosing an efficient way to traverse these regions (in other words, an efficient ordering of the partial multiplications), and 3) choosing the optimal storage format for each region. The first step divides the whole matrix into smaller parts (regions) that can fit in the cache; the second step improves cache reuse by choosing the order in which these regions are processed.
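
    A toy sketch of the three steps under simplifying assumptions: regions are fixed-size B x B tiles stored in coordinate form (the paper's step 3 would instead pick an optimal per-region format), and the traversal order is plain row-major. All names are illustrative.

        from collections import defaultdict

        def build_regions(coo, B):
            regions = defaultdict(list)     # (block_row, block_col) -> entries
            for i, j, v in coo:             # step 1: non-empty B x B regions
                regions[(i // B, j // B)].append((i, j, v))
            return regions

        def spmv_regions(regions, x, n):
            y = [0.0] * n
            for key in sorted(regions):     # step 2: traversal order (row-major)
                for i, j, v in regions[key]:
                    y[i] += v * x[j]        # step 3 would pick a per-region format
            return y

        coo = [(0, 0, 2.0), (0, 3, 1.0), (2, 1, 4.0), (3, 3, 3.0)]
        regions = build_regions(coo, B=2)
        print(spmv_regions(regions, [1.0, 1.0, 1.0, 1.0], n=4))  # [3.0, 0.0, 4.0, 3.0]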

    Acceleration of Le Bail fitting method on parallel platforms

    The Le Bail fitting method is a procedure used in applied crystallography, mainly during crystal structure determination. As in many other applications, there is a need for high performance and short execution times. In this paper, we describe the use of parallel computing for the mathematical operations in Le Bail fitting. We present an algorithm implementing the method and highlight possible approaches to its parallelization. We then propose a sample parallel version using the OpenMP API and report its performance results on a real multithreaded system. Further potential for massive parallelization is also discussed.
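
    A minimal sketch of the data parallelism such a fitting loop exposes: the points of a simulated powder pattern are independent, so the profile summation can be split across workers. Python's multiprocessing stands in here for the paper's OpenMP loops, and Gaussian peaks stand in for the real profile functions; everything is illustrative.

        import math
        from multiprocessing import Pool

        PEAKS = [(20.0, 100.0, 0.2), (25.5, 60.0, 0.25)]  # (position, intensity, width)

        def profile_at(two_theta):
            # sum of (simplified) Gaussian peak profiles at one 2-theta point
            return sum(I * math.exp(-((two_theta - p) / w) ** 2)
                       for p, I, w in PEAKS)

        if __name__ == "__main__":
            grid = [15.0 + 0.01 * k for k in range(2000)]  # 2-theta grid
            with Pool() as pool:                           # points are independent
                pattern = pool.map(profile_at, grid, chunksize=200)
            print(max(pattern))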

    Efficient parallel evaluation of block properties of sparse matrices


    A new diagonal blocking format and model of cache behavior for sparse matrices

    Algorithms for sparse matrix-vector multiplication (SpMxV for short) are important building blocks in solvers of sparse systems of linear equations. Due to matrix sparsity, the memory access patterns are irregular and cache utilization suffers from low spatial and temporal locality. To reduce this effect, the diagonal register blocking format was designed. This paper introduces a new combined format, called CARB, for storing sparse matrices that extends the possibilities of the diagonal register blocking format.

    We have also developed a probabilistic model for estimating the number of cache misses during SpMxV in the CARB format. Using hardware cache monitoring tools, we compare the predicted numbers of cache misses with the measured ones on an Intel x86 architecture with L1 and L2 caches. The average accuracy of our analytical model is around 95% for the L2 cache and 88% for the L1 cache.
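
    A toy analogue of such an analytical model, under strong simplifying assumptions: it counts only the compulsory misses of a plain CSR SpMxV (every byte loaded exactly once) and ignores the irregular reuse of x that the paper's probabilistic CARB model actually estimates. Sizes and names are illustrative.

        def csr_compulsory_misses(n, nnz, line=64, val=8, idx=4, ptr=4):
            bytes_touched = (nnz * (val + idx)   # values[] and col_idx[]
                             + (n + 1) * ptr     # row_ptr[]
                             + n * (val + val))  # one pass over x[] and y[]
            return bytes_touched // line         # misses if every byte loads once

        # e.g. a 1M x 1M matrix with 10M nonzeros
        print(csr_compulsory_misses(n=1_000_000, nnz=10_000_000))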

    Parallelization of artificial immune systems using a massive parallel approach via modern GPUs

    Parallelization is one possible approach to obtaining better algorithm performance and overcoming the limits of sequential computation. In this paper, we present a study of the parallelization of the opt-aiNet algorithm, which comes from Artificial Immune Systems, part of a large family of population-based algorithms inspired by nature. The opt-aiNet algorithm is based on an immune network theory that incorporates knowledge about mammalian immune systems to create a state-of-the-art algorithm suitable for multimodal function optimization. The algorithm is known for combining local and global search with an emphasis on maintaining a stable set of distinct local extrema as solutions. Moreover, its modifications can be used for many other purposes, such as data clustering or combinatorial optimization. The parallel version of the algorithm is designed especially for modern graphics processing units (GPUs). The preliminary performance results show a very significant speedup over computation with traditional central processing units.
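
    A heavily simplified, sequential sketch of the opt-aiNet loop described above (cloning, mutation inversely proportional to normalized fitness, suppression of similar cells, random newcomers); parameter values and names are illustrative, and the clone/evaluate phase is the part a GPU version would offload.

        import math
        import random

        def f(x):                            # multimodal test function to maximize
            return math.sin(5 * x) * math.exp(-x * x)

        def opt_ainet(gens=200, n_cells=20, n_clones=10, beta=100, sigma=0.1):
            cells = [random.uniform(-2, 2) for _ in range(n_cells)]
            for _ in range(gens):
                fits = [f(c) for c in cells]
                lo, hi = min(fits), max(fits)
                survivors = []
                for c, fit in zip(cells, fits):
                    fstar = (fit - lo) / (hi - lo + 1e-12)      # normalized fitness
                    alpha = math.exp(-fstar) / beta             # mutation strength
                    clones = [c] + [c + alpha * random.gauss(0, 1)
                                    for _ in range(n_clones)]
                    survivors.append(max(clones, key=f))        # best clone wins
                # network suppression: drop cells too close to a better one
                survivors.sort(key=f, reverse=True)
                kept = []
                for c in survivors:
                    if all(abs(c - k) > sigma for k in kept):
                        kept.append(c)
                # refill with random newcomers to keep exploring
                cells = kept + [random.uniform(-2, 2)
                                for _ in range(n_cells - len(kept))]
            return sorted(set(round(c, 3) for c in cells))

        print(opt_ainet())   # several distinct local maxima of f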