109 research outputs found
Towards a liquid compiler
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994. Includes bibliographical references (leaves 41-42). By Stephen Brooks Davis.
Reference Capabilities for Flexible Memory Management: Extended Version
Verona is a concurrent object-oriented programming language that organises
all the objects in a program into a forest of isolated regions. Memory is
managed locally for each region, so programmers can control a program's memory
use by adjusting objects' partition into regions, and by setting each region's
memory management strategy. A thread can only mutate (allocate, deallocate)
objects within one active region -- its "window of mutability". Memory
management costs are localised to the active region, ensuring overheads can be
predicted and controlled. Moving the mutability window between regions is
explicit, so code can be executed wherever it is required, yet programs remain
in control of memory use. An ownership type system based on reference
capabilities enforces region isolation, controlling aliasing within and between
regions, yet supporting objects moving between regions and threads. Data
accesses never need expensive atomic operations, and are always thread-safe.
Comment: 87 pages, 10 figures, 5 listings, 4 tables. Extended version of paper
to be published at OOPSLA 202
Parallel hierarchical radiosity rendering
The radiosity equation is examined, and is found to contain a previously unexploited symmetry. This symmetry is formalized, and a solution method previously unusable in the field of computer graphics (conjugate gradients) is shown to be superior to all methods currently in use. A detailed analysis of all solution techniques previously applied to the radiosity problem is conducted, and results presented. So-called hierarchical methods have reduced the operational complexity of the N-body problem from O(N^2) to O(N log N) assuming a pre-set error tolerance. An algorithm following the same basic tenets has been applied to radiosity rendering by other researchers, and has reduced the operational complexity from O(N^2) to (arguably) O(N). Shortcomings in the state-of-the-art hierarchical radiosity method are pointed out, and enhancements are offered. A consistent treatment of various types of error is found to be absent from present methods. Catastrophic error is possible in the visibility assessment between two polygons. A self-consistency check is possible during the solution process, but never exploited. Until now, supercomputer-class computers have not been used to solve radiosity problems at a production-quality level even though realistic image synthesis has always been a prodigious consumer of computer time. A state-of-the-art hierarchical radiosity code is implemented on an nCUBE-2 parallel computer, and discussed in detail. The algorithm is found to have ample sources of parallelism, in both data- and operational modes. Its performance is analyzed in detail. The hierarchical method has only been applied to realistic image synthesis since 1991. Not surprisingly, many avenues of further research are open. Some are pointed out, and include: analytic determination of coupling factors, quantifying discretization error, incorporating specular light reflection modes into the hierarchical treatment, and exploring what other important physical problems might benefit from the hierarchical approach
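The abstract above names conjugate gradients as the solver made usable by the symmetry of the radiosity system. As a minimal sketch only, assuming a small symmetric positive-definite matrix stands in for that system (the matrix `A`, right-hand side `b`, and the function name are illustrative and not taken from the thesis), the method could look like this:

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A (list of lists).

    Illustrative stand-in only; the thesis's actual system is not reproduced here.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # first search direction is the residual
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small enough
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        # New direction is conjugate to the previous ones with respect to A
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x


# Hypothetical 2x2 example system
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution = conjugate_gradient(A, b)
```

The method relies on `A` being symmetric, which is why the symmetry described in the abstract matters for applying it.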
DynaSOAr: A Parallel Memory Allocator for Object-Oriented Programming on GPUs with Efficient Memory Access (Artifact)
This artifact contains the source code of DynaSOAr, a CUDA framework for Single-Method Multiple-Objects (SMMO) applications. SMMO is a class of object-oriented programs in which parallelism is expressed by running the same method on all objects of a type.
DynaSOAr is a dynamic memory allocator, combined with a data layout DSL and a parallel do-all operation. This artifact provides a tutorial explaining the API of DynaSOAr, along with nine benchmark applications from different domains. All benchmarks can be configured to use a different memory allocator to allow for a comparison with other state-of-the-art memory allocators
RA-LPEL: A Resource-Aware Light-Weight Parallel Execution Layer for Reactive Stream Processing Networks on The SCC Many-core Tiled Architecture
In computing, the available processing power has continually fallen short of the performance demanded. As a consequence, performance improvement has been the main focus of processor design. However, due to the phenomenon known as the “Power Wall”, it has become infeasible to build faster processors simply by increasing the processor’s clock speed. One of the resulting trends in hardware design is to integrate several simple, power-efficient cores on the same chip. This design shift poses challenges of its own. In the past, programs automatically became faster as clock frequency increased, without modification. This is no longer true of many-core architectures. To achieve maximum performance, programs have to run concurrently on more than one core, which forces the general computing paradigm to become increasingly parallel.
This thesis focuses on Reactive Stream Programs (RSPs). In stream processing, the system consists of computing nodes connected by communication streams. These streams simplify concurrency management on modern many-core architectures because of their implicit synchronisation. An RSP is a stream-processing system that implements a reactive system. RSPs work in tandem with their environment, and the load imposed by the environment may vary over time. This provides a unique opportunity to increase performance per watt. The research contribution of this thesis is the design of an execution layer for running RSPs on tiled many-core architectures, using Intel’s Single-chip Cloud Computer (SCC) processor as a concrete experimentation platform. Further, we have developed a Dynamic Voltage and Frequency Scaling (DVFS) technique for RSPs deployed on many-core architectures. In contrast to many other approaches, our DVFS technique does not require control over the power settings of individual computing elements, making it applicable to modern many-core architectures in which power can be changed only per power island. The experimental results confirm that the proposed DVFS technique can effectively improve energy efficiency, i.e. increase performance per watt, for RSPs
Multidisciplinary simulation of an aircraft's wake-vortex encounter with the DLR TAU code (original title: Multidisziplinäre Simulation des Wirbelschleppen Durchfluges eines Flugzeuges mit dem DLR TAU-Code)
Background: The design of an aircraft must account for a large number of different load cases. On the one hand, the aircraft is optimised for cruise flight to achieve the greatest possible range at low fuel consumption. On the other hand, it must be ensured that the aircraft remains controllable in critical situations, such as an encounter with a strong gust or with the wake vortex of a preceding aircraft, and that it withstands the additional loads. To predict the additional aerodynamic loads, simplified methods based on strip theory or doublet-lattice methods are generally used today. As a result, prediction errors of these simple methods are to be expected, particularly at high flight speeds (compressibility effects, nonlinearities), which is why correspondingly high safety factors are applied. This can lead to substantial over-dimensioning of the structure, and hence to increased aircraft weight.
Objective: To improve the accuracy of predicting the additional loads induced by wake vortices compared with the simple methods mentioned above, the DLR RANS solver TAU is to be given the capability to simulate wake-vortex encounters of aircraft. The aircraft's reaction to the loads is also to be taken into account through coupling with flight mechanics.
Approach: Various authors have implemented the so-called disturbance velocity approach in Euler and RANS methods, in which the disturbances induced by gusts can be prescribed as disturbance velocities as a function of space and time. The advantage is that the atmospheric disturbances need not be resolved numerically in the flow field during the simulation. Standard meshes can be used, which promises efficient numerical treatment compared with resolving the atmospheric disturbances. This method has also been implemented in the TAU code for gust encounters and applied successfully. It has since been extended to wake-vortex encounters. The velocities induced by the wake vortex are modelled by superposing two counter-rotating "Burnham-Hallock" vortices.
As an example of a wake-vortex encounter, the interaction of a generic fighter aircraft with the wake vortex of an aircraft flying ahead was successfully demonstrated. In addition to the aerodynamics, the flight mechanics are also taken into account in order to capture the aircraft's reaction to the wake vortex and to control inputs
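The abstract above models the wake-induced velocities as the superposition of two counter-rotating Burnham-Hallock vortices but does not give the formula. A minimal sketch follows, assuming the commonly cited Burnham-Hallock tangential velocity profile, v_theta(r) = Gamma / (2 pi) * r / (r^2 + r_c^2). The function names, the coordinate layout (vortex centres on the y axis), and the sign convention (positive circulation is counter-clockwise in the y-z plane) are illustrative assumptions, not details of the TAU implementation.

```python
import math


def burnham_hallock_vtheta(r, circulation, core_radius):
    """Tangential velocity of one Burnham-Hallock vortex at distance r from its centre."""
    return circulation / (2.0 * math.pi) * r / (r * r + core_radius * core_radius)


def vortex_pair_velocity(y, z, circulation, core_radius, spacing):
    """Induced (v, w) at (y, z) from two counter-rotating vortices.

    Assumed layout: centres at (-spacing/2, 0) and (+spacing/2, 0), with
    circulations +circulation and -circulation respectively.
    """
    v_total, w_total = 0.0, 0.0
    centres = ((-0.5 * spacing, circulation), (0.5 * spacing, -circulation))
    for y_c, gamma in centres:
        dy, dz = y - y_c, z
        r = math.hypot(dy, dz)
        if r == 0.0:
            continue  # the profile gives zero velocity at the vortex centre
        v_theta = burnham_hallock_vtheta(r, gamma, core_radius)
        # Unit vector in the counter-clockwise tangential direction is (-dz, dy) / r
        v_total += -v_theta * dz / r
        w_total += v_theta * dy / r
    return v_total, w_total


# Hypothetical values chosen only to make the arithmetic easy to follow
v_mid, w_mid = vortex_pair_velocity(0.0, 0.0, 2.0 * math.pi, 1.0, 2.0)
```

In a disturbance velocity approach such as the one described above, a field of this kind would be prescribed as a function of position rather than resolved on the mesh.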
A Computer program for the extraction of bipolar transistor SPICE models
Each year semiconductor manufacturers spend millions of dollars on the development of new products. Committing a newly developed circuit to silicon can be very costly, especially if the circuit is poorly designed or hard to manufacture. To reduce this risk, it is of utmost importance that circuit simulation be done correctly in the early stages of development. Simulation can only be as good as the model being used. Almost all designers, whether digital, analogue, or mixed-mode, use SPICE to simulate their circuits. SPICE accuracy is inherently dependent on the discrete element models being used, i.e. transistors, diodes, MOSFETs. Development of models usually includes physical measurements, SPICE parameter extraction from the measurement data, and then SPICE simulation to verify the extracted model parameters. This can be a tedious and time-consuming process. To speed up this process and make it much easier to accomplish, a computer program has been written to aid in the extraction of SPICE parameters for bipolar transistors. To use the program, all that is needed is Gummel-Poon data, collector current vs collector-emitter bias voltage data, junction capacitance vs voltage data, and fT vs collector current data. The user can then methodically choose a parameter of interest, vary its value, and immediately see the effect on the simulation in an on-screen graphical presentation. The user can simulate each of the above-mentioned measurements and view the simulation simultaneously with the data. Using this technique the user can develop the entire SPICE model, including temperature effects, and gain a very good working knowledge of parameter effects on the SPICE simulation
Pázmány Péter: Az nagi Calvinus Ianosnac hiszec egi Istene (1609)
Typeset text of Pázmány's polemical treatise
- …