21 research outputs found

    Loop Parallelization using Dynamic Commutativity Analysis

    FADAlib: an open source C++ library for fuzzy array dataflow analysis

    Ubiquitous multicore architectures require that many levels of parallelism be found in codes. Dependence analysis is the main approach used in compilers to detect parallelism. It enables vectorisation and automatic parallelisation, among many other optimising transformations, and is therefore of crucial importance for optimising compilers. This paper presents new open source software, FADAlib, which performs instance-wise dataflow analysis for scalar and array references. The software is a C++ implementation of the Fuzzy Array Dataflow Analysis (FADA) method. This method can be applied to codes with irregular control, such as while-loops, if-then-else constructs, or non-regular array accesses, and it computes exact instance-wise dataflow information on regular codes. As far as we know, FADAlib is the first released open source C++ implementation of instance-wise dataflow dependence analysis that handles these larger classes of programs. In addition, the library is technically independent of any particular compiler and can be plugged into many of them; this article shows an example of a successful integration inside gcc/GRAPHITE. We give details of the library implementation and then report initial results with gcc and a possible use for trace scheduling on irregular codes.
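    The distinction between exact and fuzzy analysis can be made concrete with two small loops (our own illustrative sketch, not FADAlib code or its API):

```cpp
// Illustration only: the kinds of code FADA distinguishes.

// Regular code: instance-wise dataflow is exact. The read of a[i-1] at
// iteration i is always produced by the write at iteration i-1.
void regular(double *a, int n) {
    for (int i = 1; i < n; ++i)
        a[i] = a[i - 1] + 1.0;
}

// Irregular code: a while-loop, a data-dependent branch, and a non-affine
// subscript a[idx[i]]. The producer of each read depends on runtime values,
// so the analysis can only return a parameterized ("fuzzy") approximation
// of the dataflow.
void irregular(double *a, const int *idx, int n) {
    int i = 0;
    while (i < n) {
        if (a[idx[i]] > 0.0)
            a[idx[i]] = a[idx[i]] - 1.0;
        ++i;
    }
}
```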

    Compiling Recurrences over Dense and Sparse Arrays

    Recurrence equations lie at the heart of many computational paradigms, including dynamic programming, graph analysis, and linear solvers. These equations are often expensive to compute, and much work has gone into optimizing them for different situations. The set of recurrence implementations is a large design space spanning the set of all recurrences (e.g., the Viterbi and Floyd-Warshall algorithms), the choice of data structures (e.g., dense and sparse matrices), and the set of possible loop orders. Optimized library implementations do not exist for most points in this design space, so developers must often manually implement and optimize recurrences. We present a general framework for compiling recurrence equations into native code corresponding to any valid point in this design space. In this framework, users specify a system of recurrences, the data structures for storing the inputs and outputs, and a set of scheduling primitives for optimization. A greedy algorithm then takes this specification and lowers it into a native program that respects the dependencies inherent in the recurrence equations. We describe the compiler transformations necessary to lower this high-level specification into native parallel code for either sparse or dense data structures, and we provide an algorithm for determining whether the recurrence system is solvable with the provided scheduling primitives. We evaluate the performance and correctness of the generated code on computational tasks from domains including dense and sparse matrix solvers, dynamic programming, graph problems, and sparse tensor algebra. We demonstrate that the generated code performs competitively with handwritten library implementations.
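    To fix ideas, here is one hand-written point in that design space (our own sketch, not the framework's generated code): the Floyd-Warshall recurrence specialized to a dense matrix with the canonical loop order, the kind of program such a compiler would generate from the equation plus data-structure and scheduling choices:

```cpp
#include <algorithm>
#include <vector>

// Floyd-Warshall recurrence d[i][j] = min(d[i][j], d[i][k] + d[k][j]),
// specialized to a dense adjacency matrix and the canonical k-i-j order.
// The k loop carries the recurrence's dependencies and must stay sequential;
// the i and j iterations are independent and could be parallelized.
void floyd_warshall(std::vector<std::vector<double>> &d) {
    const std::size_t n = d.size();
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                d[i][j] = std::min(d[i][j], d[i][k] + d[k][j]);
}
```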

    Language support for dynamic, hierarchical data partitioning

    A novel numerical framework for simulation of multiscale spatio-temporally non-linear systems in additive manufacturing processes.

    New computationally efficient numerical techniques have been formulated for multi-scale analysis in order to bridge the mesoscopic and macroscopic scales of the thermal and mechanical responses of a material. These numerical techniques reduce the computational effort required to simulate metal-based Additive Manufacturing (AM) processes. Given the availability of physics-based constitutive models for the response at mesoscopic scales, these techniques help in evaluating the thermal response and mechanical properties during layer-by-layer processing in AM. Two classes of numerical techniques have been explored. The first class has been developed for evaluating the periodic spatiotemporal thermal response involving multiple time and spatial scales at the continuum level. The second class is targeted at modeling multi-scale, multi-energy dissipative phenomena during the solid-state Ultrasonic Consolidation process; this includes bridging the mesoscopic response of a crystal plasticity finite element framework at inter- and intragranular scales with the response at a macroscopic material point. This response has been used to develop an energy-dissipative constitutive model for a multi-surface interface at the macroscopic scale. As part of the first class of techniques, an adaptive dynamic meshing strategy has been developed that reduces computational cost through efficient node and element renumbering and efficient assembly of stiffness matrices. This strategy reduced the computational cost of the thermal simulation of the Selective Laser Melting (SLM) process by roughly 100 times. The method is not limited to SLM and can be extended to any other fusion-based additive manufacturing process and, more generally, to any moving-energy-source finite element problem. Novel FEM-based beam theories have been formulated that are more general than traditional beam theories for solid deformation; they are the first to treat thermal problems within a beam-analysis approach, and they can simulate beams of general cross-section while matching the results of a complete three-dimensional analysis. In addition, a traditional Cholesky decomposition algorithm has been modified to reduce the computational cost of solving the simultaneous equations involved in FEM simulations. Solid-state processes have been simulated with a crystal-plasticity-based nonlinear finite element algorithm, which has been further sped up by the introduction of an interfacial contact constitutive model formulation. This framework is supported by a novel methodology that solves contact problems without the additional computational overhead of incorporating constraint equations, averting the use of penalty springs.
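    For context, the baseline being accelerated in the last steps is the classical Cholesky factorization of the symmetric positive-definite systems arising in FEM (shown below is the standard textbook algorithm, not the thesis's modified variant):

```cpp
#include <cmath>
#include <vector>

// Standard Cholesky factorization A = L * L^T of a symmetric
// positive-definite n-by-n matrix stored row-major in a flat vector.
// Solving A x = b then reduces to two cheap triangular solves with L.
std::vector<double> cholesky(const std::vector<double> &A, int n) {
    std::vector<double> L(n * n, 0.0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j <= i; ++j) {
            double s = A[i * n + j];
            for (int k = 0; k < j; ++k)
                s -= L[i * n + k] * L[j * n + k];    // subtract known products
            L[i * n + j] = (i == j) ? std::sqrt(s)       // diagonal entry
                                    : s / L[j * n + j];  // below-diagonal entry
        }
    }
    return L;
}
```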

    Towards Semantic Clone Detection, Benchmarking, and Evaluation

    Developers copy and paste their code to speed up the development process. Sometimes they copy code from other systems or look up code online to solve a complex problem. Developers reuse copied code with or without modifications. The resulting similar or identical code fragments are called code clones. Sometimes clones are unintentionally written when a developer implements the same or similar functionality. Even when the resulting code fragments are not textually similar, if they implement the same functionality they are still considered clones and are classified as semantic clones. Semantic clones are defined as code fragments that perform the exact same computation but are implemented using different syntax. Software cloning research indicates that code clones exist in all software systems; on average, 5% to 20% of software code is cloned. Due to the potential impact of clones, whether positive or negative, it is essential to locate, track, and manage clones in the source code. Considerable research has been conducted on all types of code clones, including clone detection, analysis, management, and evaluation. Despite the great interest in code clones, there has been considerably less work on semantic clones. As described in this thesis, I advance the state of the art in semantic clone research in several ways. First, I conducted an empirical study to investigate the status of code cloning in and across open-source game systems and the effectiveness of different normalization, filtering, and transformation techniques for detecting semantic clones. Second, I developed an approach to detect clones across .NET programming languages using an intermediate language. Third, I developed a technique that uses an intermediate language and an ontology to detect semantic clones. Fourth, I mined Stack Overflow answers to build a semantic code clone benchmark that represents real semantic code clones in four programming languages: C, C#, Java, and Python. Fifth, I defined a comprehensive taxonomy that identifies semantic clone types. Finally, I implemented an injection framework that uses the benchmark to compare and evaluate semantic code clone detectors by automatically measuring recall.
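    To make the definition concrete, here is a textbook semantic clone pair (our own illustration, not an example from the thesis or its benchmark): two functions that compute the exact same result with almost no syntactic overlap, so text- and token-based detectors miss them:

```cpp
#include <numeric>
#include <vector>

// Semantic clones: identical computation, different syntax.
int sum_loop(const std::vector<int> &v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}

int sum_fold(const std::vector<int> &v) {
    return std::accumulate(v.begin(), v.end(), 0);
}
```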

    Code Clone Discovery Based on Concolic Analysis

    Software is often large, complicated, and expensive to build and maintain. Redundant code can make these applications even more costly and difficult to maintain. Duplicated code is often introduced into these systems for a variety of reasons, including developer churn, deficient developer comprehension of the application, and lack of adherence to proper development practices. Code redundancy has several adverse effects on a software application, including an increased size of the codebase and inconsistent developer changes due to elevated program comprehension needs. A code clone is defined as multiple code fragments that produce similar results when given the same input. Four types of clones are generally recognized, ranging from simple type-1 and type-2 clones to the more complicated type-3 and type-4 clones. Numerous clone detection mechanisms are able to identify the simpler types of code clone candidates, but far fewer claim the ability to find the more difficult type-3 clones. Before CCCD, MeCC and FCD were the only clone detection techniques capable of finding type-4 clones. A drawback of MeCC is the excessive time required to detect clones and the likely exploration of an unreasonably large number of possible paths. FCD requires extensive amounts of random data and a significant amount of time in order to discover clones. This dissertation presents a new process for discovering code clones known as Concolic Code Clone Discovery (CCCD). This technique discovers code clone candidates based on the functionality of the application, not its syntactic nature, which means that naming conventions and comments in the source code have no effect on the proposed clone detection process. CCCD finds clones by first performing concolic analysis on the targeted source code. Concolic analysis combines concrete and symbolic execution in order to traverse all possible paths of the targeted program; these paths are represented by the generated concolic output. A diff tool is then used to determine whether the concolic output for a method is identical to the output produced for another method. Duplicated output is indicative of a code clone. CCCD was validated against several open source applications along with clones of all four types as defined by previous research. The results demonstrate that CCCD was able to detect all types of clone candidates with a high level of accuracy. In the future, CCCD will be used to examine how software developers work with type-3 and type-4 clones. CCCD will also be applied to various areas of security research, including intrusion detection mechanisms.
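    The final comparison step can be sketched as follows (an assumed shape inferred from the description above, not CCCD's actual code): bucket each method by its concolic output, and report every bucket with more than one member as a set of clone candidates:

```cpp
#include <map>
#include <string>
#include <vector>

// Group methods whose concolic outputs are byte-identical; any group with
// two or more members is a set of code clone candidates.
std::vector<std::vector<std::string>>
clone_classes(const std::map<std::string, std::string> &concolic_output) {
    std::map<std::string, std::vector<std::string>> by_output;
    for (const auto &[method, output] : concolic_output)
        by_output[output].push_back(method);  // bucket by identical output

    std::vector<std::vector<std::string>> classes;
    for (const auto &[output, methods] : by_output)
        if (methods.size() > 1)
            classes.push_back(methods);
    return classes;
}
```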

    Recognition of Linear Algebra Operations in a Polyhedral Program (Reconnaissance d'opérations d'algèbre linéaire dans un programme polyédrique)

    Writing code that uses an architecture to its full capability has become an increasingly difficult problem in recent years. For some key operations, a dedicated accelerator or a finely tuned implementation exists and delivers the best performance. Thus, when compiling a code, identifying these operations and issuing calls to their high-performance implementations is attractive. In this dissertation, we focus on the problem of detecting these operations. We propose a framework which detects linear algebra subcomputations within a polyhedral program. The main idea of this framework is to partition the computation in order to isolate the different subcomputations in a regular manner, then to consider each portion of the computation and try to recognize it as a combination of linear algebra operations. We perform the partitioning of the computation by using a program transformation called monoparametric tiling. This transformation partitions the computation into blocks whose shape is a homothetic scaling of a fixed-size partitioning. We show that the tiled program remains polyhedral while allowing a limited amount of parametrization: a single size parameter. This is an improvement over previous work on tiling, which forced a choice between these two properties. Then, in order to recognize computations, we introduce a template recognition algorithm, built on a state-of-the-art program equivalence algorithm. We also propose several extensions in order to handle semantic properties commonly encountered in linear algebra. Finally, we combine these two contributions into a framework which detects linear algebra subcomputations. Part of this framework is a library of templates based on the BLAS specification. We demonstrate our framework on several applications.
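    To illustrate the end goal (our own sketch, not the dissertation's code), here is a triple loop that such a framework would recognize as an instance of a GEMM template, together with the tuned CBLAS call that could replace it:

```cpp
#include <cblas.h>  // link against any CBLAS implementation

// Naive kernel computing C += A * B for row-major m-by-k A, k-by-n B,
// and m-by-n C: the kind of loop nest the recognition framework would
// match against its BLAS-based template library.
void gemm_naive(const double *A, const double *B, double *C,
                int m, int n, int k) {
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j)
            for (int l = 0; l < k; ++l)
                C[i * n + j] += A[i * k + l] * B[l * n + j];
}

// The equivalent call to a high-performance BLAS implementation.
void gemm_blas(const double *A, const double *B, double *C,
               int m, int n, int k) {
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, 1.0, A, k, B, n, 1.0, C, n);
}
```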