1,058 research outputs found

    Increasing the chemical content of turbulent flame models through the use of parallel computing


    Doctor of Philosophy

    Emerging trends such as growing architectural diversity and increased emphasis on energy and power efficiency motivate the need for code that adapts to its execution context (input dataset and target architecture). Unfortunately, writing such code remains difficult, and is typically attempted only by a small group of motivated expert programmers who are highly knowledgeable about the relationship between software and its hardware mapping. In this dissertation, we introduce novel abstractions and techniques based on automatic performance tuning that enable both experts and nonexperts (application developers) to produce adaptive code. We present two new frameworks for adaptive programming: Nitro and Surge. Nitro enables expert programmers to specify code variants, or alternative implementations of the same computation, together with meta-information for selecting among them. It then utilizes supervised classification to select an optimal code variant at runtime based on characteristics of the execution context. Surge, on the other hand, provides a high-level nested data-parallel programming interface for application developers to specify computations. It then employs a two-level mechanism to automatically generate code variants and then tunes them using Nitro. The resulting code performs on par with or better than handcrafted reference implementations on both CPUs and GPUs. In addition to abstractions for expressing code variants, this dissertation also presents novel strategies for adaptively tuning them. First, we introduce a technique for dynamically selecting an optimal code variant at runtime based on characteristics of the input dataset. On five high-performance GPU applications, variants tuned using this strategy achieve over 93% of the performance of variants selected through exhaustive search. Next, we present a novel approach based on multitask learning to develop a code variant selection model on a target architecture from training on different source architectures. We evaluate this approach on a set of six benchmark applications and a collection of six NVIDIA GPUs from three distinct architecture generations. Finally, we implement support for combined code variant and frequency selection based on multiple objectives, including power and energy efficiency. Using this strategy, we construct a GPU sorting implementation that provides improved energy and power efficiency with less than a proportional drop in sorting throughput.
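The variant-selection idea in this abstract (several implementations of one computation, plus a classifier that picks among them from features of the input) can be illustrated with a minimal Python sketch. This is not Nitro's API: the names `VARIANTS`, `features`, `NearestNeighborSelector`, and `run_adaptive` are hypothetical, the classifier is a plain 1-nearest-neighbor rule, and any training labels passed to it would normally come from timing each variant offline.

```python
from math import dist

# Two hypothetical code variants: alternative implementations of the same sum.
def sum_builtin(xs):
    return sum(xs)

def sum_loop(xs):
    total = 0
    for x in xs:
        total += x
    return total

VARIANTS = {"builtin": sum_builtin, "loop": sum_loop}

def features(xs):
    """Feature vector describing the input; input length is an assumed toy choice."""
    return (float(len(xs)),)

class NearestNeighborSelector:
    """1-nearest-neighbor classifier from input features to a variant label."""

    def __init__(self):
        self.examples = []  # list of (feature_vector, best_variant_label)

    def train(self, labeled_examples):
        # Labels are assumed to come from offline measurements of each variant.
        self.examples = list(labeled_examples)

    def select(self, feats):
        # Return the label of the closest training example.
        _, label = min(self.examples, key=lambda ex: dist(ex[0], feats))
        return label

def run_adaptive(selector, xs):
    """Pick a variant at runtime from the input's features, then run it."""
    label = selector.select(features(xs))
    return label, VARIANTS[label](xs)
```

Every variant must return the same result for the same input; only the expected cost differs, which is what makes it safe for a learned model to choose among them.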

    Polyhedral+Dataflow Graphs

    This research presents an intermediate compiler representation that is designed for optimization and emphasizes the temporary storage requirements and execution schedule of a given computation to guide optimization decisions. The representation is expressed as a dataflow graph that describes computational statements and data mappings within the polyhedral compilation model. The targeted applications include both the regular and irregular scientific domains. The intermediate representation can be integrated into existing compiler infrastructures. A specification language, implemented as a domain-specific language in C++, describes the graph components and the transformations that can be applied. The visual representation allows users to reason about optimizations. Graph variants can be translated into source code or other representations. The language, intermediate representation, and associated transformations have been applied to improve the performance of differential equation solvers, sparse matrix operations, tensor decomposition, and structured multigrid methods.

    Doctor of Philosophy

    In the static analysis of functional programs, control-flow analysis (k-CFA) is a classic method of approximating program behavior as a finite-state automaton. CFA2 and abstract garbage collection are two recent, yet orthogonal, improvements on k-CFA. CFA2 approximates program behavior as a pushdown system, using summarization for the stack. CFA2 can accurately approximate arbitrarily deep recursive function calls, whereas k-CFA cannot. Abstract garbage collection removes unreachable values from the store/heap. If unreachable values are not removed from a static analysis, they can become reachable again, which pollutes the final analysis and makes it less precise. Unfortunately, as these two techniques were originally formulated, they are incompatible. CFA2's summarization technique for managing the stack obscures the stack such that abstract garbage collection is unable to examine the stack for reachable values. This dissertation presents introspective pushdown control-flow analysis, which manages the stack explicitly through stack changes (pushes and pops). Because this analysis is able to examine the stack by how it has changed, abstract garbage collection is able to examine the stack for reachable values. Thus, introspective pushdown control-flow analysis successfully merges the benefits of CFA2 and abstract garbage collection to create a more precise static analysis. Additionally, the high-performance computing community has viewed functional programming techniques and tools as lacking the efficiency necessary for their applications. Nebo is a declarative domain-specific language embedded in C++ for discretizing partial differential equations for transport phenomena. For efficient execution, Nebo exploits a version of expression templates, based on the C++ template system, which is a type-less, completely pure, Turing-complete functional language with burdensome syntax. Nebo's declarative syntax supports functional tools, such as point-wise lifting of complex expressions and functional composition of stencil operators. Nebo's primary abstraction is mathematical assignment, which separates what a calculation does from how that calculation is executed. Currently Nebo supports single-core execution, multicore (thread-based) parallel execution, and GPU execution. With single-core execution, Nebo performs on par with the loops and code that it replaces in Wasatch, a pre-existing high-performance simulation project. With multicore (thread-based) execution, Nebo can scale linearly (with roughly 90% efficiency) up to 6 processors, compared to its single-core execution. Moreover, Nebo's GPU execution can be up to 37x faster than its single-core execution. Finally, Wasatch (the pre-existing high-performance simulation project that uses Nebo) can scale up to 262K cores.
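Abstract garbage collection, as described in this abstract, amounts to restricting the abstract store to the addresses reachable from a set of roots. The Python sketch below illustrates that reachability step only. It is not the dissertation's formulation: `AbstractValue`, `reachable`, and `abstract_gc` are hypothetical names, and the roots are assumed to have already been gathered from the current environment and the stack frames (the part that an explicitly managed stack makes possible).

```python
from collections import namedtuple

# A hypothetical abstract value: a label plus the store addresses it refers to
# (for a closure, the addresses bound in its captured environment).
AbstractValue = namedtuple("AbstractValue", ["label", "refs"])

def reachable(roots, store):
    """Return the set of store addresses reachable from the root addresses."""
    seen = set()
    worklist = list(roots)
    while worklist:
        addr = worklist.pop()
        if addr in seen or addr not in store:
            continue
        seen.add(addr)
        # Follow every reference held by every abstract value at this address.
        for value in store[addr]:
            worklist.extend(value.refs)
    return seen

def abstract_gc(roots, store):
    """Drop every store entry that is unreachable from the roots."""
    live = reachable(roots, store)
    return {addr: values for addr, values in store.items() if addr in live}
```

Because an abstract address can stand for many concrete locations, dropping a dead entry before the address is reused keeps stale values from merging with fresh ones, which is the precision gain the abstract refers to.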

    Modelling the live-electronics in electroacoustic music using particle systems

    Contemporary music is largely influenced by technology. Empowered by currently available tools and resources, composers can not only compose with sounds, but also compose the sounds themselves. Personal computers equipped with intuitive and interactive audio applications and development tools allow a vast range of real-time manipulation of live instrumental input, as well as real-time generation of sound through synthesis techniques. Consequently, achieving a desired sonority and interaction between the electronic and acoustic sounds in real time relies heavily on the choice and technical implementation of the audio processes and logical structures that will perform the electronic part of the composition. Due to the artistic and technical complexity of developing and implementing such a work, a strategy historically adopted by composers is to develop the composition in collaboration with a technology expert, who in this context is known as a musical assistant. From this perspective, the work of the musical assistant can be considered one of translating musical, artistic and aesthetic concepts into mathematical algorithms and audio processes. The work presented in this dissertation addresses the problem of choosing, combining and manipulating the audio processes and logical structures that take place in the live-electronics (i.e., the electronic part of a mixed music composition) of a contemporary electroacoustic music composition, by using particle systems to model and simulate the dynamic behaviors that reflect the conceptual and aesthetic principles envisaged by the composer for a given musical piece. The research begins with a thorough identification and analysis of the agents, processes and structures that are present in the live-electronics system of a mixed music composition. From this analysis, a logical formalization of a typical live-electronics system is proposed, and then adapted to integrate a particle-based modelling strategy. From the formalization, a theoretical and practical framework for developing and implementing live-electronics systems for mixed music compositions using particle systems is proposed. The framework is tested and validated in the development of distinct mixed music compositions by distinct composers, in a real professional context. From the analysis of the case studies and the logical formalization, and from the feedback given by the composers, it is possible to conclude that the proposed particle systems modelling method is effective in assisting the conceptual translation of musical and aesthetic ideas into implementable audio processing software.
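To make the particle-based modelling idea concrete, here is a minimal Python sketch of a particle system whose state drives an audio parameter. Everything in it is an assumption for illustration: the abstract does not specify the dissertation's particle model or its parameter mappings, so the damped-spring force, the `Particle`, `step`, and `to_gain` names, and the position-to-gain mapping are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Particle:
    position: float  # one-dimensional position, later mapped to a parameter
    velocity: float

def step(particles, attractor, stiffness, damping, dt):
    """Advance each particle one time step under a damped spring pull
    toward the attractor, using semi-implicit Euler integration."""
    for p in particles:
        acceleration = stiffness * (attractor - p.position) - damping * p.velocity
        p.velocity += acceleration * dt
        p.position += p.velocity * dt

def to_gain(particle, lo=0.0, hi=1.0):
    """Map a particle's position to a gain value by clamping it to [lo, hi]."""
    return max(lo, min(hi, particle.position))
```

In a setup like this, a composer-facing control such as the attractor position shapes how the gain evolves over time, while the particle dynamics supply the moment-to-moment behavior.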