813 research outputs found

    mlirSynth: Automatic, Retargetable Program Raising in Multi-Level IR using Program Synthesis

    MLIR is an emerging compiler infrastructure for modern hardware, but existing programs cannot take advantage of MLIR’s high-performance compilation if they are described in lower-level general purpose languages. Consequently, to avoid programs needing to be rewritten manually, this has led to efforts to automatically raise lower-level to higher-level dialects in MLIR. However, current methods rely on manually-defined raising rules, which limit their applicability and make them challenging to maintain as MLIR dialects evolve. We present mlirSynth – a novel approach which translates programs from lower-level MLIR dialects to high-level ones without manually defined rules. Instead, it uses available dialect definitions to construct a program space and searches it effectively using type constraints and equivalences. We demonstrate its effectiveness by raising C programs to two distinct high-level MLIR dialects, which enables us to use existing high-level dialect specific compilation flows. On Polybench, we show a greater coverage than previous approaches, resulting in geomean speedups of 2.5x (Intel) and 3.4x (AMD) over state-of-the-art compilation flows. mlirSynth also enables retargetability to domain-specific accelerators, resulting in a geomean speedup of 21.6x on a TPU

    mlirSynth: Automatic, Retargetable Program Raising in Multi-Level IR using Program Synthesis

    MLIR is an emerging compiler infrastructure for modern hardware, but existing programs cannot take advantage of MLIR's high-performance compilation if they are described in lower-level general purpose languages. Consequently, to avoid programs needing to be rewritten manually, this has led to efforts to automatically raise lower-level to higher-level dialects in MLIR. However, current methods rely on manually-defined raising rules, which limit their applicability and make them challenging to maintain as MLIR dialects evolve. We present mlirSynth, a novel approach which translates programs from lower-level MLIR dialects to high-level ones without manually defined rules. Instead, it uses available dialect definitions to construct a program space and searches it effectively using type constraints and equivalences. We demonstrate its effectiveness by raising C programs to two distinct high-level MLIR dialects, which enables us to use existing high-level dialect specific compilation flows. On Polybench, we show a greater coverage than previous approaches, resulting in geomean speedups of 2.5x (Intel) and 3.4x (AMD) over state-of-the-art compilation flows for the C programming language. mlirSynth also enables retargetability to domain-specific accelerators, resulting in a geomean speedup of 21.6x on a TPU.

    Applications of Machine Learning in Pharmacogenomics: Clustering Plasma Concentration-Time Curves

    Pharmaceutical researchers are continually searching for techniques to improve both drug development processes and patient outcomes. An area of recent interest is the potential for machine learning (ML) applications within pharmacology. One such application not yet given close study is the unsupervised clustering of plasma concentration-time curves, hereafter, pharmacokinetic (PK) curves. In this paper, we present our findings on how to cluster PK curves by their similarity. Specifically, we find clustering to be effective at identifying similar-shaped PK curves and informative for understanding patterns within each cluster of PK curves. Because PK curves are time series data objects, our approach utilizes the extensive body of research related to the clustering of time series data as a starting point. As such, we examine many dissimilarity measures between time series data objects to find those most suitable for PK curves. We identify Euclidean distance as generally most appropriate for clustering PK curves, and we further show that dynamic time warping, Fréchet, and structure-based measures of dissimilarity like correlation may produce unexpected results. As an illustration, we apply these methods in a case study with 250 PK curves used in a previous pharmacogenomic study. Our case study finds that an unsupervised ML clustering with Euclidean distance, without any subject genetic information, is able to independently validate the same conclusions as the reference pharmacogenomic results. To our knowledge, this is the first such demonstration. Further, the case study demonstrates how the clustering of PK curves may generate insights that could be difficult to perceive solely with population level summary statistics of PK metrics. Comment: 38 pages, 14 figures, 3 tables
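
    As a rough illustration of clustering PK curves by Euclidean distance, the sketch below groups four synthetic curves sampled on a common time grid. Everything in it is an assumption made for illustration and is not taken from the paper: the one-compartment oral-absorption model used to generate the curves, the parameter values, the sampling times, and the choice of average-linkage agglomerative clustering.

```python
import math

def pk_curve(dose, ka, ke, times):
    """Synthetic concentration-time curve from a textbook one-compartment
    oral-absorption model (an assumption; not the paper's data)."""
    scale = dose * ka / (ka - ke)
    return [scale * (math.exp(-ke * t) - math.exp(-ka * t)) for t in times]

def euclidean(a, b):
    """Euclidean distance between two curves sampled at the same times."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(curves, k):
    """Merge the two closest clusters (mean pairwise distance) until k remain."""
    clusters = [[i] for i in range(len(curves))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(euclidean(curves[p], curves[q])
                            for p in clusters[i] for q in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical sampling times (hours) and parameters: two slow eliminators
# (indices 0 and 1) and two fast eliminators (indices 2 and 3).
times = [0.5, 1, 2, 4, 8, 12, 24]
curves = [
    pk_curve(100, 1.0, 0.10, times),
    pk_curve(100, 1.1, 0.11, times),
    pk_curve(100, 1.0, 0.40, times),
    pk_curve(100, 0.9, 0.42, times),
]
clusters = average_linkage(curves, 2)
print(clusters)
```

    Euclidean distance compares concentrations point by point at matching times, so it needs every curve on the same time grid; curves with different sampling schedules would first have to be interpolated onto a shared one.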

    Parvalbumin interneurons are differentially connected to principal cells in inhibitory feedback microcircuits along the dorso-ventral axis of the medial entorhinal cortex

    The medial entorhinal cortex (mEC) shows a high degree of spatial tuning, predominantly grid cell activity, which is reliant on robust, dynamic inhibition provided by local interneurons (INs). In fact, feedback inhibitory microcircuits involving fast-spiking parvalbumin (PV) basket cells (BCs) are believed to contribute dominantly to the emergence of grid cell firing in principal cells (PrCs). However, the strength of PV BC-mediated inhibition onto PrCs is not uniform in this region, but high in the dorsal and weak in the ventral mEC. This is in good correlation with divergent grid field sizes, but the underlying morphologic and physiological mechanisms remain unknown. In this study, we examined PV BCs in layer (L)2/3 of the mEC, characterizing their intrinsic physiology, morphology and synaptic connectivity in the juvenile rat. We show that while intrinsic physiology and morphology are broadly similar over the dorsoventral axis, PV BCs form more connections onto local PrCs in the dorsal mEC, independent of target cell type. In turn, the major PrC subtypes, pyramidal cell (PC) and stellate cell (SC), form connections onto PV BCs with lower, but equal probability. These data thus identify inhibitory connectivity as a source of the gradient of inhibition, plausibly explaining divergent grid field formation along this dorsoventral axis of the mEC.

    On the Geroch-Traschen class of metrics

    We compare two approaches to semi-Riemannian metrics of low regularity. The maximally 'reasonable' distributional setting of Geroch and Traschen is shown to be consistently contained in the more general setting of nonlinear distributional geometry in the sense of Colombeau.

    The Promises of Hybrid Hexagonal/Classical Tiling for GPU

    Time-tiling is necessary for efficient execution of iterative stencil computations. But the usual hyper-rectangular tiles cannot be used because of positive/negative dependence distances along the stencil's spatial dimensions. Several prior efforts have addressed this issue. However, known techniques trade enhanced data reuse for other causes of inefficiency, such as unbalanced parallelism, redundant computations, or increased control flow overhead incompatible with efficient GPU execution. We explore a new path to maximize the effectiveness of time-tiling on iterative stencil computations. Our approach is particularly well suited for GPUs. It does not require any redundant computations, it favors coalesced global-memory access and data reuse in shared-memory/cache, avoids thread divergence, and extracts a high degree of parallelism. We introduce hybrid hexagonal tiling, combining hexagonal tile shapes along the time (sequential) dimension and one spatial dimension, with classical tiling for other spatial dimensions. A hexagonal tile shape simultaneously enables parallel tile execution and reuse along the time dimension. Experimental results demonstrate significant performance improvements over existing stencil compilers.

    A regularisation approach to causality theory for C^{1,1} Lorentzian metrics

    We show that many standard results of Lorentzian causality theory remain valid if the regularity of the metric is reduced to C^{1,1}. Our approach is based on regularisations of the metric adapted to the causal structure