104 research outputs found
Peephole optimization of asynchronous macromodule networks
Journal ArticleAbstract- Most high-level synthesis tools for asynchronous circuits take descriptions in concurrent hardware description languages and generate networks of macromodules or handshake components. In this paper, we propose a peephole optimizer for these networks. Our peephole optimizer first deduces an equivalent blackbox behavior for the network using Dill's tracetheoretic parallel composition operator. It then applies a new procedure called burst-mode reduction to obtain burst-mode machines from the deduced behavior. In a significant number of examples, our optimizer achieves gate-count improvements by a factor of five, and speed (cycle-time) improvements by a factor of two. Burst-mode reduction can be applied to any macromodule network that is delay insensitive as well as deterministic. A significant number of asynchronous circuits, especially those generated by asynchronous high-level synthesis tools, fall into this class, thus making our procedure widely applicable
A full field, 3-D velocimeter for microgravity crystallization experiments
The programming and algorithms needed for implementing a full-field, 3-D velocimeter for laminar flow systems and the appropriate hardware to fully implement this ultimate system are discussed. It appears that imaging using a synched pair of video cameras and digitizer boards with synched rails for camera motion will provide a viable solution to the laminar tracking problem. The algorithms given here are simple, which should speed processing. On a heavily loaded VAXstation 3100 the particle identification can take 15 to 30 seconds, with the tracking taking less than one second. It seeems reasonable to assume that four image pairs can thus be acquired and analyzed in under one minute
Peephole optimization of asynchronous networks through process composition and burst-mode machine generation
Journal ArticleIn this paper, we discuss the problem of improving the efficiency of macromodule networks generated through asynchronous high level synthesis. We compose the behaviors of the modules in the sub-network being optimized using Dill's trace-theoretic operators to get a single behavioral description for the whole sub-network. From the composite trace structures so obtained, we obtain interface state graphs (ISG) (as described by Sutherland, Sproull, and Molnar), encode the ISGs to obtain encoded ISGs (EISGs), and then apply a procedure we have developed called Burst-mode machine reduction (BM-reduction) to obtain burstmode machines from EISGs. We then synthesize burst-mode machine circuits (currently) using the tool of Ken Yun (Stanford). We can report significant area- and time-improvements on a number of examples, as a result of our optimization method
Tecnologia adaptativa aplicada à otimização de código em compiladores
The programming memory space of embedded microcontrolled systems is usually limited.
Although, compilers nowadays apply optimizing transformations to the embedded software, the lack of memory space can become a critical problem to the designer with the introduction of new features and corrections in the original software. In contrast, workstations hosting development systems for embedded applications are faster and have much more memory. Given this scenario, we have developed a peephole optimizer exploring an adaptive technique that requires more memory and execution time, but is capable to achieve a better compression ratio of the object code than a conventional peephole optimizer. The introduction of an adaptive action enables the algorithm to self-modify its behavior in response to a specific input condition and to search the sequence of optimization rules that best optimizes the object code among the many possible sequences resulted from the superposition of two or more equally applicable optimization rules.O espaço de memória de programação de sistemas microcontrolados embutidos é normalmente limitado. Embora os compiladores atuais apliquem transformações otimizantes ao software embutido, a falta de espaço de memória pode se tornar um problema crítico para o projetista com a introdução de novas facilidades e correções no software original. Por outro lado, as estações de trabalho hospedando os sistemas de desenvolvimento para aplicações embutidas são mais rápidas e dispõem de mais memória. Diante deste panorama, desenvolvemos um otimizador peephole explorando uma técnica adaptativa que requer mais memória e tempo de execução, mas é capaz de obter uma melhor taxa de compressão do código objeto do que um otimizador peephole convencional. A introdução de uma ação adaptativa permite que o algoritmo auto modifique o seu comportamento em resposta a uma condição de entrada específica e procure a seqüência de regras de otimização que melhor otimiza o código objeto entre as muitas seqüências possíveis resultantes da superposição de duas ou mais regras de otimização igualmente aplicáveis.Eje: Teoría (TEOR)Red de Universidades con Carreras en Informática (RedUNCI
Tecnologia adaptativa aplicada à otimização de código em compiladores
The programming memory space of embedded microcontrolled systems is usually limited.
Although, compilers nowadays apply optimizing transformations to the embedded software, the lack of memory space can become a critical problem to the designer with the introduction of new features and corrections in the original software. In contrast, workstations hosting development systems for embedded applications are faster and have much more memory. Given this scenario, we have developed a peephole optimizer exploring an adaptive technique that requires more memory and execution time, but is capable to achieve a better compression ratio of the object code than a conventional peephole optimizer. The introduction of an adaptive action enables the algorithm to self-modify its behavior in response to a specific input condition and to search the sequence of optimization rules that best optimizes the object code among the many possible sequences resulted from the superposition of two or more equally applicable optimization rules.O espaço de memória de programação de sistemas microcontrolados embutidos é normalmente limitado. Embora os compiladores atuais apliquem transformações otimizantes ao software embutido, a falta de espaço de memória pode se tornar um problema crítico para o projetista com a introdução de novas facilidades e correções no software original. Por outro lado, as estações de trabalho hospedando os sistemas de desenvolvimento para aplicações embutidas são mais rápidas e dispõem de mais memória. Diante deste panorama, desenvolvemos um otimizador peephole explorando uma técnica adaptativa que requer mais memória e tempo de execução, mas é capaz de obter uma melhor taxa de compressão do código objeto do que um otimizador peephole convencional. A introdução de uma ação adaptativa permite que o algoritmo auto modifique o seu comportamento em resposta a uma condição de entrada específica e procure a seqüência de regras de otimização que melhor otimiza o código objeto entre as muitas seqüências possíveis resultantes da superposição de duas ou mais regras de otimização igualmente aplicáveis.Eje: Teoría (TEOR)Red de Universidades con Carreras en Informática (RedUNCI
Peephole optimization of asynchronous macromodule networks
Journal ArticleMost high level synthesis tools for asynchronous circuits take descriptions in concurrent hardware description languages and generate networks of macromodules or handshake components. In this paper we describe a peephole optimizer for such macromodule networks that often effects area and/or time improvements. Our optimizer first deduces an equivalent black-box behavior for the given network of macrmodules using Dill's trace-theoretic parallel composition operator. It then applies a new procedure culled Burst-mode reduction to obtain burst-mode machines, which can be synthesized into gate networks using available tools. Since burst-mode reduction can be applied to any macromodule network that is delay-insensitive as well as deterministic, our optimizer covers a significant number of asynchronous circuits especially those generated by asynchronous high level synthesis tools
Sidekick compilation with xDSL
Traditionally, compiler researchers either conduct experiments within an
existing production compiler or develop their own prototype compiler; both
options come with trade-offs. On one hand, prototyping in a production compiler
can be cumbersome, as they are often optimized for program compilation speed at
the expense of software simplicity and development speed. On the other hand,
the transition from a prototype compiler to production requires significant
engineering work. To bridge this gap, we introduce the concept of sidekick
compiler frameworks, an approach that uses multiple frameworks that
interoperate with each other by leveraging textual interchange formats and
declarative descriptions of abstractions. Each such compiler framework is
specialized for specific use cases, such as performance or prototyping.
Abstractions are by design shared across frameworks, simplifying the transition
from prototyping to production. We demonstrate this idea with xDSL, a sidekick
for MLIR focused on prototyping and teaching. xDSL interoperates with MLIR
through a shared textual IR and the exchange of IRs through an IR Definition
Language. The benefits of sidekick compiler frameworks are evaluated by showing
on three use cases how xDSL impacts their development: teaching, DSL
compilation, and rewrite system prototyping. We also investigate the trade-offs
that xDSL offers, and demonstrate how we simplify the transition between
frameworks using the IRDL dialect. With sidekick compilation, we envision a
future in which engineers minimize the cost of development by choosing a
framework built for their immediate needs, and later transitioning to
production with minimal overhead
Recommended from our members
COMPASS: A Community-driven Parallelization Advisor for Sequential Software
The widespread adoption of multicores has renewed the emphasis on the use of parallelism to improve performance. The present and growing diversity in hardware architectures and software environments, however, continues to pose difficulties in the effective use of parallelism thus delaying a quick and smooth transition to the concurrency era. In this paper, we describe the research being conducted at Columbia University on a system called COMPASS that aims to simplify this transition by providing advice to programmers while they reengineer their code for parallelism. The advice proffered to the programmer is based on the wisdom collected from programmers who have already parallelized some similar code. The utility of COMPASS rests, not only on its ability to collect the wisdom unintrusively but also on its ability to automatically seek, find and synthesize this wisdom into advice that is tailored to the task at hand, i.e., the code the user is considering parallelizing and the environment in which the optimized program is planned to execute. COMPASS provides a platform and an extensible framework for sharing human expertise about code parallelization — widely, and on diverse hardware and software. By leveraging the "wisdom of crowds" model, which has been conjectured to scale exponentially and which has successfully worked for wikis, COMPASS aims to enable rapid propagation of knowledge about code parallelization in the context of the actual parallelization reengineering, and thus continue to extend the benefits of Moore's law scaling to science and society
- …