Search CORE

9 research outputs found

Predicting the Performance of Reconfigurable Interconnects in Distributed Shared-Memory Systems

Author: ARTUNDO I
Dambre Joni
DEBAES C
Heirman Wim
Stroobandt Dirk
Thienpont Hugo
Van Campenhout Jan
Publication venue
Publication date: 01/01/2006
Field of study

Ghent University Academic Bibliography

Archivsystem Ask23

Toward fast and accurate architecture exploration in a hardware/software codesign flow

Author: Stroobandt Dirk
Publication venue: 'World Scientific and Engineering Academy and Society (WSEAS)'
Publication date: 01/01/2002
Field of study

Ghent University Academic Bibliography

Reconfigurable interconnects in DSM systems: a focus on context switch behavior

Author: Artundo I
Dambre Joni
Debaes C
Heirman Wim
Manjarres D
Thienpont H
Van Campenhout Jan
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2006
Field of study

Recent advances in the development of reconfigurable optical interconnect technologies allow for the fabrication of low cost and run-time adaptable interconnects in large distributed shared-memory (DSM) multiprocessor machines. This can allow the use of adaptable interconnection networks that alleviate the huge bottleneck present due to the gap between the processing speed and the memory access time over the network. In this paper we have studied the scheduling of tasks by the kernel of the operating system (OS) and its influence on communication between the processing nodes of the system, focusing on the traffic generated just after a context switch. We aim to use these results as a basis to propose a potential reconfiguration of the network that could provide a significant speedup

Crossref

Ghent University Academic Bibliography

Predictie van interconnectie-eigenschappen van digitale schakelingen voor de exploratie van ontwerpkeuzes en technologie

Author: Dambre Joni
Publication venue
Publication date: 01/01/2003
Field of study

Ghent University Academic Bibliography

Copper wafer bonding in three-dimensional integration

Author: Chen Kuan-Neng, 1974-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2005
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.Includes bibliographical references (p. 165-176).Three-dimensional (3D) integration, in which multiple layers of devices are stacked with high density of interconnects between the layers, offers solutions for problems when the critical dimensions in integrated circuits keep shrinking. Copper wafer bonding has been considered as a strong candidate for fabrication of three-dimensional integrated circuits (3-D IC). This thesis work involves fundamental studies of copper wafer bonding and bonding performance of bonded interconnects. Copper bonded wafers exhibit good bonding qualities and present no original bonding interfaces when the bonding process occurs at 400⁰C/4000 mbar for 30 min, followed by nitrogen anneal at 400⁰C for 30 min. Oxide distribution in the bonded layer is uniform and sparse. Evolution of microstructure morphologies and grain orientations of copper bonded wafers during bonding and annealing were studied. The bonded layer reaches steady state after post-bonding anneal. The microstructure morphologies and bond strengths of copper bonded wafers under different bonding conditions were investigated.A map summarizing these results provides a useful reference on process conditions suitable for three-dimensional integration based on copper wafer bonding. Similar microstructure morphology of copper bonded interconnects was observed to that of copper bonded wafers. Specific contact resistances of bonded interconnects of approximately 10⁻⁸ [ohms]-cm² were measured by using a novel test structure which can eliminate the errors from misalignment during bonding. The bonding qualities of different interconnect sizes and densities have been investigated. In addition to increasing the bonding temperature and duration, options such as larger interconnect sizes, total bonding area, or use of dummy pads for bonding in the unused area improve the quality of bonded interconnects. Process development of silicon layer stacking based on Cu wafer bonding was successfully applied to demonstrate a strong four-layer-stack structure.Bonded Cu layers in this structure become homogeneous layers and do not show original bonding interfaces. This process can be reliably applied in three-dimensional integration applications.by Kuan-Neng Chen.Ph.D

DSpace@MIT

Untersuchungen zur Kostenoptimierung für Hardware-Emulatoren durch Anwendung von Methoden der partiellen Laufzeitrekonfiguration

Author: Beckert René
Publication venue: TUDpress
Publication date: 13/06/2013
Field of study

Der vorliegende Band der wissenschaftlichen Schriftenreihe Eingebettete Selbstorganisierende Systeme widmet sich der Optimierung von Hardware Emulatoren durch die Anwendung von Methoden der partiellen Laufzeitrekonfiguration. An aktuelle Schaltkreis- und Systementwürfe werden zunehmend divergente Anforderungen gestellt. Einer sehr kurzen Entwicklungszeit für eine schnelle Markteinführung steht, um teure und aufwändige Re-Desings zu verhindern, eine möglichst umfangreiche Testabdeckung des Entwurfs gegenüber. Um die Zeit für die Tests zu reduzieren, kommen überwiegend FPGA-basierte HW-Emulatoren zum Einsatz. Durch den Einfluss der steigenden Komplexität aktueller Entwürfe auf die Emulator-Plattform reduziert sich jedoch signifikant die Performance der Emulatoren. Die in Emulatoren eingesetzten FPGAs sind aber zunehmend partiell zur Laufzeit rekonfigurierbar. Der in der vorliegenden Arbeit umgesetzte Ansatz behandelt die Anwendung von Methoden der Laufzeitrekonfiguration auf dem Gebiet der Hardware-Emulation. Dafür ist zunächst eine Partitionierung des zu testenden Entwurfs in möglichst funktional unabhängige Systemteile notwendig. Für eine optimierte und ressourceneffiziente Platzierung der einzelnen HW-Module während der Emulation, ist ein ebenfalls auf dem FPGA platziertes Kommunikationsnetzwerk implementiert. Der vorgestellte Ansatz wird an verschiedenen Beispielen anschaulich illustriert. So kann der Leser die Mächtigkeit der entwickelten Methodik nachvollziehen und wird motiviert, das Verfahren auch auf weitere Anwendungsfälle zu übertragen.Current circuit and system designs consist a lot of gate numbers and divergent requirements. In contrast to a short development and time to market schedule, the needs for perfect test coverage and quality are rising. One approach to cover this problem is the FPGA based functional test of electronic circuits. State of the art FPGA platforms doesn't consist enough gates to support fully custom designs. The thesis catches this problem and gives some approaches to use partial dynamic reconfiguration to solve the size problem. A fully automated design flow demonstrates partial partitioning of designs, modifications to use dynamic reconfiguration and its schedule. At the end of the work, some examples demonstrates the power of the approach

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Multimedia ONline ARchiv CHemnitz

Dynamically reconfigurable asynchronous processor

Author: Fawaz Khodor Ahmad
Publication venue: The University of Edinburgh
Publication date: 25/06/2012
Field of study

The main design requirements for today's mobile applications are: · high throughput performance. · high energy efficiency. · high programmability. Until now, the choice of platform has often been limited to Application-Specific Integrated Circuits (ASICs), due to their best-of-breed performance and power consumption. The economies of scale possible with these high-volume markets have traditionally been able to hide the high Non-Recurring Engineering (NRE) costs required for designing and fabricating new ASICs. However, with the NREs and design time escalating with each generation of mobile applications, this practice may be reaching its limit. Designers today are looking at programmable solutions, so that they can respond more rapidly to changes in the market and spread costs over several generations of mobile applications. However, there have been few feasible alternatives to ASICs: Digital Signals Processors (DSPs) and microprocessors cannot meet the throughput requirements, whereas Field-Programmable Gate Arrays (FPGAs) require too much area and power. Coarse-grained dynamically reconfigurable architectures offer better solutions for high throughput applications, when power and area considerations are taken into account. One promising example is the Reconfigurable Instruction Cell Array (RICA). RICA consists of an array of cells with an interconnect that can be dynamically reconfigured on every cycle. This allows quite complex datapaths to be rendered onto the fabric and executed in a single configuration - making these architectures particularly suitable to stream processing. Furthermore, RICA can be programmed from C, making it a good fit with existing design methodologies. However the RICA architecture has a drawback: poor scalability in terms of area and power. As the core gets bigger, the number of sequential elements in the array must be increased significantly to maintain the ability to achieve high throughputs through pipelining. As a result, a larger clock tree is required to synchronise the increased number of sequential elements. The clock tree therefore takes up a larger percentage of the area and power consumption of the core. This thesis presents a novel Dynamically Reconfigurable Asynchronous Processor (DRAP), aimed at high-throughput mobile applications. DRAP is based on the RICA architecture, but uses asynchronous design techniques - methods of designing digital systems without clocks. The absence of a global clock signal makes DRAP more scalable in terms of power and area overhead than its synchronous counterpart. The DRAP architecture maintains most of the benefits of custom asynchronous design, whilst also providing programmability via conventional high-level languages. Results show that the DRAP processor delivers considerably lower power consumption when compared to a market-leading Very Long Instruction Word (VLIW) processor and a low-power ARM processor. For example, DRAP resulted in a reduction in power consumption of 20 times compared to the ARM7 processor, and 29 times compared to the TIC64x VLIW, when running the same benchmark capped to the same throughput and for the same process technology (0.13μm). When compared to an equivalent RICA design, DRAP was up to 22% larger than RICA but resulted in a power reduction of up to 1.9 times. It was also capable of achieving up to 2.8 times higher throughputs than RICA for the same benchmarks

Edinburgh Research Archive

Special issue on System-Level Interconnect Prediction

Author: Markov Igor
Scheffer Louis K
Stroobandt Dirk
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007
Field of study

Ghent University Academic Bibliography

Special issue on System-Level Interconnect Prediction

Author: Dirk Stroobandt
Igor L. Markov
Louis K. Scheffer
Publication venue: 'Elsevier BV'
Publication date
Field of study

Crossref