Search CORE

203 research outputs found

Special Algorithm for Stability Analysis of Multistable Biological Regulatory Systems

Author: Hong Hoon
Tang Xiaoxian
Xia Bican
Publication venue
Publication date: 26/12/2013
Field of study

We consider the problem of counting (stable) equilibriums of an important family of algebraic differential equations modeling multistable biological regulatory systems. The problem can be solved, in principle, using real quantifier elimination algorithms, in particular real root classification algorithms. However, it is well known that they can handle only very small cases due to the enormous computing time requirements. In this paper, we present a special algorithm which is much more efficient than the general methods. Its efficiency comes from the exploitation of certain interesting structures of the family of differential equations.Comment: 24 pages, 5 algorithms, 10 figure

arXiv.org e-Print Archive

CiteSeerX

Modelling and Analysis for Cyber-Physical Systems: An SMT-based approach

Author: Dung Phan Anh
Publication venue: Technical University of Denmark
Publication date: 01/01/2015
Field of study

Online Research Database In Technology

The Polyhedral Model Beyond Loops Recursion Optimization and Parallelization Through Polyhedral Modeling

Author: Clauss Philippe
Kobeissi Salwa
Publication venue: HAL CCSD
Publication date: 23/01/2019
Field of study

International audienceThere may be a huge gap between the statements outlined by programmers in a program source code and instructions that are actually performed by a given processor architecture when running the executable code. This gap is due to the way the input code has been interpreted, translated and transformed by the compiler and the final processor hardware. Thus, there is an opportunity for efficient optimization strategies, that are dedicated to specific control structures and memory access patterns, to apply as soon as the actual runtime behavior has been discovered, even if they could not have been applied on the original source code. In this paper, we develop this idea by identifying code extracts that behave as polyhedral-compliant loops at runtime, while not having been outlined at all as loops in the original source code. In particular, we are interested in recursive functions whose runtime behavior can be modeled as polyhedral loops. Therefore, the scope of this study exclusively includes recursive functions whose control flow and memory accesses exhibit an affine behavior, which means that there exists a semantically equivalent affine loop nest, candidate for poly-hedral optimizations. Accordingly, our approach is based on analyzing early executions of a recursive program using a Nested Loop Recognition (NLR) algorithm, performing the affine loop modeling of the original program runtime behavior , which is then used to generate an equivalent iterative program, finally optimized using the polyhedral compiler Polly. We present some preliminary results showing that this approach brings recursion optimization techniques into a higher level in addition to widening the scope of the polyhe-dral model to include originally non-loop programs

INRIA a CCSD electronic archive server

Hal-Diderot

Towards Comprehensive Parametric Code Generation Targeting Graphics Processing Units in Support of Scientific Computation

Author: Xie Ning
Publication venue: Scholarship@Western
Publication date: 08/11/2016
Field of study

The most popular multithreaded languages based on the fork-join concurrency model (CIlkPlus, OpenMP) are currently being extended to support other forms of parallelism (vectorization, pipelining and single-instruction-multiple-data (SIMD)). In the SIMD case, the objective is to execute the corresponding code on a many-core device, like a GPGPU, for which the CUDA language is a natural choice. Since the programming concepts of CilkPlus and OpenMP are very different from those of CUDA, it is desirable to automatically generate optimized CUDA-like code from CilkPlus or OpenMP. In this thesis, we propose an accelerator model for annotated C/C++ code together with an implementation that allows the automatic generation of CUDA code. One of the key features of this CUDA code generator is that it supports the generation of CUDA kernel code where program parameters (like number of threads per block) and machine parameters (like shared memory size) are treated as unknown symbols. Hence, these parameters need not to be known at code-generation-time: machine parameters and program parameters can be respectively determined when the generated code is installed on the target machine. In addition, we show how these parametric CUDA programs can be optimized at compile-time in the form of a case discussion, where cases depend on the values of machine parameters (e.g. hardware resource limits) and program parameters (e.g. dimension sizes of thread-blocks). This generation of parametric CUDA kernels requires to deal with non-linear polynomial expressions during the dependence analysis and tiling phase. To achieve these algebraic calculations, we take advantage of techniques from computer algebra, in particular in the RegularChains library of Maple. Various illustrative examples are provided together with performance evaluation

Scholarship@Western

AutopaR: An Automatic Parallelization Tool for Recursive Calls

Author: Kalender M.E.
Mergenci C.
Ozturk O.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

Manycore systems are becoming more and more powerful with the integration of hundreds of cores on a single chip. However, writing parallel programs on these manycore systems has become a problem since the amount of available parallel tools and applications are limited. Although exploiting parallelism in software is possible, it requires different design decisions, significant programmer effort and is error prone. Different libraries and tools try to make the transition to parallelism easier, however there is no concrete system to make it transparent to software developer. To this end, our proposed tool is a step forward to improve the current state. Our approach, Autopar, specifically aims at achieving automatic parallelization of recursive applications using static program analysis. It first decides on the recursive functions of a given program. Then, it performs analysis and collects information about these recursive functions. Our analysis module automatically collects program information without requiring any modification in the program design or developer involvement. Finally, it achieves automatic parallelization by introducing necessary OpenMP pragmas in appropriate places in the application. © 2014 IEEE

Bilkent University Institutional Repository

MetaFork: A Compilation Framework for Concurrency Models Targeting Hardware Accelerators

Author: Chen Xiaohui
Publication venue: Scholarship@Western
Publication date: 24/03/2017
Field of study

Parallel programming is gaining ground in various domains due to the tremendous computational power that it brings; however, it also requires a substantial code crafting effort to achieve performance improvement. Unfortunately, in most cases, performance tuning has to be accomplished manually by programmers. We argue that automated tuning is necessary due to the combination of the following factors. First, code optimization is machine-dependent. That is, optimization preferred on one machine may be not suitable for another machine. Second, as the possible optimization search space increases, manually finding an optimized configuration is hard. Therefore, developing new compiler techniques for optimizing applications is of considerable interest. This thesis aims at generating new techniques that will help programmers develop efficient algorithms and code targeting hardware acceleration technologies, in a more effective manner. Our work is organized around a compilation framework, called MetaFork, for concurrency platforms and its application to automatic parallelization. MetaFork is a high-level programming language extending C/C++, which combines several models of concurrency including fork-join, SIMD and pipelining parallelism. MetaFork is also a compilation framework which aims at facilitating the design and implementation of concurrent programs through four key features which make MetaFork unique and novel: (1) Perform automatic code translation between concurrency platforms targeting multi-core architectures. (2) Provide a high-level language for expressing concurrency as in the fork-join model, the SIMD paradigm and the pipelining parallelism. (3) Generate parallel code from serial code with an emphasis on code depending on machine or program parameters (e.g. cache size, number of processors, number of threads per thread block). (4) Optimize code depending on parameters that are unknown at compile-time

Scholarship@Western

Verified lifting of stencil computations

Author: Alur Rajeev
Alvin Cheung
Amarasinghe Saman P
Armando Solar-Lezama
Catanzaro Bryan
Heroux Michael A.
Jeon Jinseong
Kamil Shoaib
Karbyshev Aleksandr
Lezama Armando Solar
Mallinson A.C.
Plotkin Gordon D
Reynolds John C.
Shachar Itzhaky
Shoaib Kamil
Vasco Diego
Zaharia Matei
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/06/2016
Field of study

This paper demonstrates a novel combination of program synthesis and verification to lift stencil computations from low-level Fortran code to a high-level summary expressed using a predicate language. The technique is sound and mostly automated, and leverages counter-example guided inductive synthesis (CEGIS) to find provably correct translations. Lifting existing code to a high-performance description language has a number of benefits, including maintainability and performance portability. For example, our experiments show that the lifted summaries can enable domain specific compilers to do a better job of parallelization as compared to an off-the-shelf compiler working on the original code, and can even support fully automatic migration to hardware accelerators such as GPUs. We have implemented verified lifting in a system called STNG and have evaluated it using microbenchmarks, mini-apps, and real-world applications. We demonstrate the benefits of verified lifting by first automatically summarizing Fortran source code into a high-level predicate language, and subsequently translating the lifted summaries into Halide, with the translated code achieving median performance speedups of 4.1X and up to 24X for non-trivial stencils as compared to the original implementation.United States. Department of Energy. Office of Science (Award DE-SC0008923)United States. Department of Energy. Office of Science (Award DE-SC0005288

DSpace@MIT

Crossref