640 research outputs found

    Generating and auto-tuning parallel stencil codes

    In this thesis, we present a software framework, Patus, which generates high performance stencil codes for different types of hardware platforms, including current multicore CPU and graphics processing unit architectures. The ultimate goals of the framework are productivity, portability (of both the code and performance), and achieving a high performance on the target platform. A stencil computation updates every grid point in a structured grid based on the values of its neighboring points. This class of computations occurs frequently in scientific and general purpose computing (e.g., in partial differential equation solvers or in image processing), justifying the focus on this kind of computation. The proposed key ingredients to achieve the goals of productivity, portability, and performance are domain specific languages (DSLs) and the auto-tuning methodology. The Patus stencil specification DSL allows the programmer to express a stencil computation in a concise way independently of hardware architecture-specific details. Thus, it increases the programmer productivity by disburdening her or him of low level programming model issues and of manually applying hardware platform-specific code optimization techniques. The use of domain specific languages also implies code reusability: once implemented, the same stencil specification can be reused on different hardware platforms, i.e., the specification code is portable across hardware architectures. Constructing the language to be geared towards a special purpose makes it amenable to more aggressive optimizations and therefore to potentially higher performance. Auto-tuning provides performance and performance portability by automated adaptation of implementation-specific parameters to the characteristics of the hardware on which the code will run. 
By automating the process of parameter tuning (which essentially amounts to solving an integer programming problem whose objective function is the code's performance as a function of the parameter configuration), the system can also be used more productively than if the programmer had to fine-tune the code manually. We show performance results for a variety of stencils, for which Patus was used to generate the corresponding implementations. The selection includes stencils taken from two real-world applications: a simulation of the temperature within the human body during hyperthermia cancer treatment and a seismic application. These examples demonstrate the framework's flexibility and ability to produce high performance code.
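The two ideas in this abstract, a stencil update and parameter auto-tuning, can be illustrated with a minimal sketch. It is not Patus code or its DSL: the function names `jacobi_step_blocked` and `autotune`, the 5-point averaging stencil, the tile-shape parameters, and the exhaustive timing search are all assumptions chosen for illustration, and the search stands in for whatever strategy the framework actually uses.

```python
import itertools
import time


def jacobi_step_blocked(grid, block_rows, block_cols):
    """Apply one 5-point averaging stencil sweep, visiting the interior tile by tile.

    The tile shape (block_rows, block_cols) changes only the traversal order,
    never the result, which is what makes it a safe tuning parameter.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for r0 in range(1, n_rows - 1, block_rows):
        for c0 in range(1, n_cols - 1, block_cols):
            for r in range(r0, min(r0 + block_rows, n_rows - 1)):
                for c in range(c0, min(c0 + block_cols, n_cols - 1)):
                    # Each interior point becomes the mean of its four neighbors.
                    out[r][c] = 0.25 * (grid[r - 1][c] + grid[r + 1][c]
                                        + grid[r][c - 1] + grid[r][c + 1])
    return out


def autotune(grid, candidates, repeats=3):
    """Time every candidate tile shape and return the fastest one (exhaustive search)."""
    best_config, best_time = None, float("inf")
    for config in candidates:
        start = time.perf_counter()
        for _ in range(repeats):
            jacobi_step_blocked(grid, *config)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_config, best_time = config, elapsed
    return best_config


if __name__ == "__main__":
    n = 64
    grid = [[float(r * n + c) for c in range(n)] for r in range(n)]
    candidates = list(itertools.product([4, 16, 64], repeat=2))
    print("fastest tile shape:", autotune(grid, candidates))
```

The measured run time plays the role of the objective function and the integer tile sizes are the decision variables; the winning configuration depends on the machine, so no particular output is expected.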

    From signals to music: a bottom-up approach to the structure of neuronal activity

    Introduction: The search for the “neural code” has been a fundamental quest in neuroscience, concerned with the way neurons and neuronal systems process and transmit information. However, the term “code” has been mostly used as a metaphor, seldom acknowledging the formal definitions introduced by information theory, and not at all acknowledging the contributions of linguistics and semiotics. The heuristic potential of the latter was suggested by structuralism, which turned the methods and findings of linguistics to other fields of knowledge. For the study of complex communication systems, such as human language and music, the necessity of an approach that considers multilayered, nested, structured organization of symbols becomes evident. We work under the hypothesis that the neural code might be as complex as these human-made codes. To test this, we propose a bottom-up approach, constructing a symbolic logic in order to translate neuronal signals into music scores. Methods: We recorded single cells’ activity from the rat’s globus pallidus pars interna under conditions of full alertness, blindfoldedness and environmental silence. We analyzed the signals with statistical, spectral, and complex methods, including Fast Fourier Transform, Hurst exponent and recurrence plot analysis. Results: The results indicated complex behavior and recurrence graphs consistent with fractality, and a Hurst exponent >0.5, evidencing temporal persistence. On the whole, these features point toward a complex behavior of the time series analyzed, also present in classical music, which upholds the hypothesis of structural similarities between music and neuronal activity. Furthermore, through our experiment we performed a comparison between music and raw neuronal activity. Our results point to the same conclusion, showing the structures of music and neuronal activity to be homologous. 
The scores were not only spontaneously tonal, but they exhibited structure and features normally present in human-made musical creations. Discussion: The hypothesis of a structural homology between the neural code and the code of music holds, suggesting that some of the insights introduced by linguistic and semiotic theory might be a useful methodological resource to go beyond the limits set by metaphoric notions of “code.”
    Affiliation: Noel, Gabriel David. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de San Martín. Instituto de Altos Estudios Sociales; Argentina
    Affiliation: Mugno, Lionel E. Conservatorio "Alfredo Luis Schiuma"; Argentina
    Affiliation: Andres, Daniela Sabrina. Consejo Nacional de Investigaciones Cientificas y Tecnicas. Instituto de Tecnologias Emergentes y Ciencias Aplicadas. - Universidad Nacional de San Martin. Instituto de Tecnologias Emergentes y Ciencias Aplicadas; Argentina. Universidad Nacional de San Martin. Escuela de Ciencia y Tecnologia. Laboratorio de Neuroingenieria; Argentin

    Analytical cost metrics: days of future past

    2019 Summer. Includes bibliographical references. Future exascale high-performance computing (HPC) systems are expected to be increasingly heterogeneous, consisting of several multi-core CPUs and a large number of accelerators, special-purpose hardware that will increase the computing power of the system in a very energy-efficient way. Specialized, energy-efficient accelerators are also an important component in many diverse systems beyond HPC: gaming machines, general purpose workstations, tablets, phones and other media devices. With Moore's law driving the evolution of hardware platforms towards exascale, the dominant performance metric (time efficiency) has now expanded to also incorporate power/energy efficiency. This work builds analytical cost models for cost metrics such as time, energy, memory access, and silicon area. These models are used to predict the performance of applications, for performance tuning, and chip design. The idea is to work with domain specific accelerators where analytical cost models can be accurately used for performance optimization. The performance optimization problems are formulated as mathematical optimization problems. This work explores the analytical cost modeling and mathematical optimization approach in a few ways. For stencil applications and GPU architectures, the analytical cost models are developed for execution time as well as energy. The models are used for performance tuning over existing architectures, and are coupled with silicon area models of GPU architectures to generate highly efficient architecture configurations. For matrix chain products, analytical closed form solutions for off-chip data movement are built and used to minimize the total data movement cost of a minimum op count tree.
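For readers unfamiliar with the "minimum op count tree" mentioned above, the following sketch shows the textbook dynamic program that finds a parenthesization of a matrix chain with the fewest scalar multiplications. It is only the classical operation-count problem: the thesis's data movement cost model is not reproduced here, and the function name `min_op_count` and the string form of the tree are assumptions made for illustration.

```python
def min_op_count(dims):
    """Return (minimum scalar multiplications, parenthesization) for a matrix chain.

    Matrix i has shape dims[i] x dims[i + 1], so a chain of n matrices
    is described by n + 1 dimensions.
    """
    n = len(dims) - 1
    if n < 1:
        raise ValueError("need at least one matrix (two dimensions)")

    # cost[i][j]: fewest multiplications to compute the product A_i ... A_j.
    cost = [[0] * n for _ in range(n)]
    # split[i][j]: index k where the best split (A_i..A_k)(A_k+1..A_j) occurs.
    split = [[0] * n for _ in range(n)]

    for length in range(2, n + 1):          # length of the sub-chain
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = float("inf")
            for k in range(i, j):
                candidate = (cost[i][k] + cost[k + 1][j]
                             + dims[i] * dims[k + 1] * dims[j + 1])
                if candidate < cost[i][j]:
                    cost[i][j] = candidate
                    split[i][j] = k

    def tree(i, j):
        """Rebuild the evaluation tree as a nested parenthesized string."""
        if i == j:
            return f"A{i}"
        k = split[i][j]
        return f"({tree(i, k)} {tree(k + 1, j)})"

    return cost[0][n - 1], tree(0, n - 1)


if __name__ == "__main__":
    # Three matrices: 10x30, 30x5, 5x60.
    print(min_op_count([10, 30, 5, 60]))  # (4500, '((A0 A1) A2)')
```

In the example, multiplying the first two matrices first costs 10·30·5 + 10·5·60 = 4500 multiplications, against 27000 for the other order.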

    Simulation Intelligence: Towards a New Generation of Scientific Methods

    The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI), for short. We argue the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) Multi-physics and multi-scale modeling; (2) Surrogate modeling and emulation; (3) Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based modeling; (6) Probabilistic programming; (7) Differentiable programming; (8) Open-ended optimization; (9) Machine programming. We believe coordinated efforts between motifs offer immense opportunity to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science, to directing nuclear energy experiments and predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing the state-of-the-art methods, presenting examples to highlight challenges and opportunities, and advocating for specific ways to advance the motifs and the synergies from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use-cases for human-machine teaming and automated science.

    Bioinformatics

    This book is divided into different research areas relevant in Bioinformatics such as biological networks, next generation sequencing, high performance computing, molecular modeling, structural bioinformatics and intelligent data analysis. Each book section introduces the basic concepts and then explains their application to problems of great relevance, so both novice and expert readers can benefit from the information and research works presented here.

    Characterisation of components and mechanisms involved in redox-regulation of protein import into chloroplasts

    The vast majority of chloroplast proteins is encoded in the nucleus and thus has to be posttranslationally imported into the organelle, a process that is facilitated by two multimeric protein machineries, the Toc and Tic complexes (translocon at the outer/inner envelope of chloroplasts). Regulation of protein import, e.g. by redox signals, is a crucial step to adapt the protein content to the biochemical requirements of the organelle. In particular, one subunit of the Tic complex, Tic62, has been proposed as a redox sensor, whose possible function is to regulate protein import by sensing and reacting to the redox state of the organelle. To elucidate a potential redox regulation of protein import, structural features, redox-dependent properties and the evolutionary origin of Tic62 were investigated. The results show that Tic62 consists of two very different modules: the N-terminal part was found to be mainly α-helical and possesses dehydrogenase activity in vitro. It is furthermore an evolutionarily ancient domain, as it is highly conserved in all photosynthetic organisms from flowering plants to cyanobacteria and even green sulfur bacteria. In contrast to this, the C-terminus is largely disordered and interacts specifically with ferredoxin-NADP+ oxidoreductase (FNR), a key enzyme in photosynthetic electron transfer reactions. Moreover, this domain was found to exist only in flowering plants, and thus the full-length Tic62 protein seems to be one of the evolutionarily youngest Tic components. The results of this study also make clear that Tic62 is a target of redox regulation itself, as its localization and interaction properties depend on the metabolic redox state: oxidized conditions lead to fast membrane binding and interaction with the Tic complex, whereas reduced conditions cause solubilization of Tic62 into the stroma and increased interaction with FNR. This novel shuttling behaviour indicates a dynamic composition of the Tic complex. 
The NADP+/NADPH ratio was furthermore found to be able to influence the import efficiency of many precursor proteins. Interestingly, not all preproteins depend on the stromal redox state for their import. Hence it was proposed that not a single stable Tic translocon exists, but several Tic subcomplexes with different subunit compositions, which might mediate the import of different precursor groups in a redox-dependent or -independent fashion. Another redox signal that was analyzed in regard to an impact on protein import is the reversible reduction of disulfide bridges, which was found to affect the channel and receptor proteins of the Toc complex. The import of all proteins that use the Toc translocon for entering the chloroplast was shown to be influenced by disulfide bridge formation. Thus it can be concluded that a variety of redox signals, acting both on the Toc and Tic complexes, are able to influence chloroplast protein import.

    Characterization and Acceleration of High Performance Compute Workloads
