Fast Fourier transform on a 3D FPGA
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 53-55). Fast Fourier Transforms play a vital role in many applications, from astronomy to cellphones. The complexity of these algorithms results from the many computational steps they require, including multiplications, and, as such, many researchers focus on implementing better FFT systems. However, all research to date focuses on the algorithm within a 2-dimensional architecture, ignoring the opportunities available in recently proposed 3-dimensional implementation technologies. This project examines FFTs in a 3D context, developing an architecture on a Field Programmable Gate Array system to demonstrate the advantage of a 3D system. By Elizabeth Basha. S.M.
Wiring requirement and three-dimensional integration technology for field programmable gate arrays
In this paper, analytical models for predicting interconnect requirements in field-programmable gate arrays (FPGAs) are presented, and opportunities for 3-D implementation of FPGAs are examined. The analytical models for 2-D FPGAs are calibrated by routing and placement experiments with benchmark circuits and extended to 3-D FPGAs. Based on system-level modeling, we find that in FPGAs with 20K 4-input look-up tables, 3-D implementation can reduce channel width, interconnect delay, and power dissipation by over 50%.
Book of Knowledge (BOK) for NASA Electronic Packaging Roadmap
The objective of this document is to update the NASA roadmap on packaging technologies (initially released in 2007) and to present the current trends toward further reducing size and increasing functionality. Due to the breadth of work being performed in the area of microelectronics packaging, this report presents only a number of key packaging technologies detailed in three industry roadmaps for conventional microelectronics and a more recently introduced roadmap for organic and printed electronics applications. The topics for each category were down-selected by reviewing the 2012 reports of the International Technology Roadmap for Semiconductors (ITRS), the 2013 roadmap reports of the International Electronics Manufacturing Initiative (iNEMI), the 2013 roadmap of the Association Connecting Electronics Industries (IPC), and the roadmap of the Organic and Printed Electronics Association (OE-A). The report also summarizes the results of numerous articles and websites specifically discussing the trends in microelectronics packaging technologies.
Exploration Of Energy And Area Efficient Techniques For Coarse-grained Reconfigurable Fabrics
Coarse-grained fabrics comprise multi-bit configurable logic blocks and configurable interconnect. This work focuses on area and energy optimization techniques for coarse-grained reconfigurable fabric architectures. A variety of design techniques have been explored to improve the utilization of computational resources and increase energy savings, including splitting, folding, and multi-level vertical interconnect. In addition, I have studied fully connected homogeneous and heterogeneous architectures and a 3D architecture, and I have examined some hybrid strategies for arranging computation units. To perform the energy and area analysis, I selected a set of signal and image processing benchmarks from the MediaBench suite and implemented various fabric architectures on a 90nm ASIC process from Synopsys. Results show area improvement with energy savings compared to the baseline architecture.
A Field Programmable Gate Array Architecture for Two-Dimensional Partial Reconfiguration
Reconfigurable machines can accelerate many applications by adapting to their needs through hardware reconfiguration. Partial reconfiguration allows the reconfiguration of a portion of a chip while the rest of the chip is busy working on tasks. Operating system models have been proposed for partially reconfigurable machines to handle the scheduling and placement of tasks. They are called OS4RC in this dissertation. The main goal of this research is to address some problems that come from the gap between OS4RC and existing chip architectures and the gap between OS4RC models and practical applications. Some existing OS4RC models are based on an impractical assumption that there is no data exchange channel between IP (Intellectual Property) circuits residing on a Field Programmable Gate Array (FPGA) chip and between an IP circuit and FPGA I/O pins. For models that do not make this assumption, the inter-IP communication channels have severe drawbacks: they do not work well with 2-D partial reconfiguration, they are not suitable for intensive data stream processing, and they are frequently very complicated to design and very expensive. To address these problems, a new chip architecture that can better support inter-IP and IP-I/O communication is proposed and a corresponding OS4RC kernel is then specified. The proposed FPGA architecture is based on an array of clusters of configurable logic blocks, with each cluster serving as a partial reconfiguration unit, and a mesh of segmented buses that provides inter-IP and IP-I/O communication channels. The proposed OS4RC kernel takes care of the scheduling, placement, and routing of circuits under the constraints of the proposed architecture. Features of the new architecture in turn reduce the kernel execution times and enable runtime scheduling, placement, and routing. The area cost and the configuration memory size of the new chip architecture are calculated and analyzed, and the efficiency of the OS4RC kernel is evaluated via simulation using three different task models.
MICROELECTRONICS PACKAGING TECHNOLOGY ROADMAPS, ASSEMBLY RELIABILITY, AND PROGNOSTICS
This paper reviews the industry roadmaps on commercial-off-the-shelf (COTS) microelectronics packaging technologies, covering the current trends toward further reducing size and increasing functionality. Due to the breadth of work being performed in this field, this paper presents only a number of key packaging technologies. The topics for each category were down-selected by reviewing reports of industry roadmaps, including the International Technology Roadmap for Semiconductors (ITRS), and by surveying publications of the International Electronics Manufacturing Initiative (iNEMI) and the roadmap of the Association Connecting Electronics Industries (IPC). The paper also summarizes the findings of numerous articles and websites devoted to emerging trends in microelectronics packaging technologies. A brief discussion is presented on the packaging hierarchy from die to package to system level. Key elements of reliability for packaging assemblies are presented, followed by a definition of reliability from a probabilistic failure perspective. An example illustrates the conventional reliability approach using Monte Carlo simulation results for a number of plastic ball grid array (PBGA) assemblies. The simulation results are compared to experimental thermal cycle test data. Prognostic health monitoring (PHM) methods, a growing field for microelectronics packaging technologies, are briefly discussed. The artificial neural network (ANN), a data-driven PHM method, is discussed in detail. Finally, the paper presents interpolations and extrapolations using ANN simulation for thermal cycle test data of PBGA and ceramic BGA (CBGA) assemblies.
Design automation and analysis of three-dimensional integrated circuits
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 165-176). This dissertation concerns the design of circuits and systems for an emerging technology known as three-dimensional integration. By stacking individual components, dice, or whole wafers using a high-density electromechanical interconnect, three-dimensional integration can achieve scalability and performance exceeding that of conventional fabrication technologies. There are two main contributions of this thesis. The first is a computer-aided design flow for the digital components of a three-dimensional integrated circuit (3-D IC). This flow primarily consists of two software tools: PR3D, a placement and routing tool for custom 3-D ICs based on standard cells, and 3-D Magic, a tool for designing, editing, and testing physical layout characteristics of 3-D ICs. The second contribution of this thesis is a performance analysis of the digital components of 3-D ICs. We use the above tools to determine the extent to which 3-D integration can improve timing, energy, and thermal performance. In doing so, we verify the estimates of stochastic computational models for 3-D IC interconnects and find that the models predict the optimal 3-D wire length to within 20% accuracy. We expand upon this analysis by examining how 3-D technology factors affect the optimal wire length that can be obtained. Our ultimate analysis extends this work by directly considering timing and energy in 3-D ICs. In all cases we find that significant performance improvements are possible. In contrast, thermal performance is expected to worsen with the use of 3-D integration. We examine precisely how thermal behavior scales in 3-D integration and determine quantitatively how the temperature may be controlled during the circuit placement process. We also show how advanced packaging technologies may be leveraged to maintain acceptable die temperatures in 3-D ICs. Finally, we explore two issues for the future of 3-D integration. We determine how technology scaling impacts the effect of 3-D integration on circuit performance. We also consider how to improve the performance of digital components in a mixed-signal 3-D integrated circuit. We conclude with a look towards future 3-D IC design tools. By Shamik Das. Ph.D.
Dynamic and fault tolerant three-dimensional cellular genetic algorithms
In the area of artificial intelligence, the development of Evolutionary Algorithms (EAs) has
been very active, especially in the last decade. These algorithms started to evolve when
scientists from various regions of the world applied the principles of evolution to algorithmic
search and problem solving. EAs have been utilised successfully in diverse complex
application areas. Their success in tackling hard problems has been the engine of the field of
Evolutionary Computation (EC). Nowadays, EAs are considered to be the best solution to
use when facing a hard search or optimisation problem.
Various improvements are continually being made through the design of new operators and
hybrid models, among others. A very important example of such improvements is the use of
parallel models of GAs (PGAs). PGAs have received widespread attention from various
researchers as they have proved to be more effective than panmictic GAs, especially in terms
of efficacy and speedup.
This thesis focuses on, and investigates, cellular Genetic Algorithms (cGAs), a
competitive variant of parallel GAs. In a cGA, the tentative solutions evolve in overlapped
neighbourhoods, allowing smooth diffusion of the solutions. The benefits derived from using
cGAs come not only from flexibility gains and their fitness to the objective target in
combination with a robust behaviour but also from their high performance and amenability
to implementation using advanced custom silicon chip technologies. Nowadays, cGAs are
considered as adaptable concepts for solving problems, especially complex optimisation
problems. Due to their structural characteristics, cGAs are able to promote an adequate
exploration/exploitation trade-off and thus maintain genetic diversity. Moreover, cGAs are
characterised as being massively parallel and easy to implement.
The structural characteristics inherent in a cGA provide an active area for investigation.
Because of the vital role grid structure plays in determining the effectiveness of the
algorithm, cellular dimensionality is the main issue to be investigated here. The
implementation of cGAs is commonly carried out on a one- or two-dimensional structure.
Studies that investigate higher cellular dimensions are lacking. Accordingly, this research
focuses on cGAs that are implemented on a three-dimensional structure. Having a structure with three dimensions, specifically a cubic structure, facilitates faster spreading of solutions
due to the shorter radius and denser neighbourhood that result from the vertical expansion of
cells. In this thesis, a comparative study of cellular dimensionality is conducted. Simulation
results demonstrate higher performance achieved by 3D-cGAs over their 2D-cGAs
counterparts. The direct implementation of 3D-cGAs on the new advanced 3D-IC
technology will provide added benefits such as higher performance combined with a
reduction in interconnection delays, routing length, and power consumption.
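
The claim about a shorter radius and denser neighbourhood can be illustrated with a minimal sketch. It assumes a toroidal (wrap-around) grid and a von Neumann neighbourhood of distance 1; neither choice is stated in the abstract, and the function names are hypothetical. For the same population of 64 cells, the cubic layout gives each cell more neighbours and a smaller worst-case distance between any two cells.

```python
def von_neumann_neighbours(cell, shape):
    """Return the distance-1 von Neumann neighbours of `cell` on a toroidal grid.

    Assumption: wrap-around borders; the abstract does not specify the topology.
    """
    neighbours = []
    for axis in range(len(shape)):
        for step in (-1, 1):
            coords = list(cell)
            coords[axis] = (coords[axis] + step) % shape[axis]
            neighbours.append(tuple(coords))
    return neighbours


def max_torus_distance(shape):
    """Largest Manhattan distance between any two cells on a toroidal grid."""
    return sum(side // 2 for side in shape)


grid_2d = (8, 8)      # 64 individuals on a 2D grid
grid_3d = (4, 4, 4)   # the same 64 individuals on a cubic grid

print(len(von_neumann_neighbours((0, 0), grid_2d)))     # 4 neighbours per cell
print(len(von_neumann_neighbours((0, 0, 0), grid_3d)))  # 6 neighbours per cell
print(max_torus_distance(grid_2d))                      # 8 hops in the worst case
print(max_torus_distance(grid_3d))                      # 6 hops in the worst case
```

Under these assumptions a good solution needs fewer generations to diffuse across the whole population in the 3D layout, which is the intuition behind the faster spreading described above.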
The maintenance of system reliability and availability is a major concern that must be
addressed. A system is likely to fail due to either hard or soft errors. Therefore, detecting a
fault before it deteriorates system performance is a crucial issue. Single Event Upsets
(SEUs), or soft errors, do not cause permanent damage to system functionality, and can be
handled using fault-tolerant techniques. Existing fault-tolerant techniques include hardware
or software fault tolerance, or a combination of both. In this thesis, fault-tolerant techniques
that mitigate SEUs at the algorithmic level are explored and the inherent abilities of cGAs to
deal with these errors are investigated. A fault-tolerant technique and several mitigation
techniques are also proposed, and faulty critical data are evaluated. Critical fault scenarios
(stuck-at-‘1’ and stuck-at-‘0’ faults) are taken into consideration. Chief among several test
and real-world problems is the problem of determining the attitude of a vehicle using a
Global Positioning System (GPS), which is an example of a hard real-time application. Results
illustrate the ability of cGAs to maintain their functionality and give an adequate
performance even with the existence of up to 40% errors in fitness score cells.
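
As a rough illustration of the stuck-at fault scenarios, the sketch below forces a single bit of selected fitness values to 0 or 1. It assumes that fitness scores are stored as unsigned integers and that a fault affects exactly one bit position; the thesis abstract does not describe its fault model at this level of detail, and the helper names are hypothetical.

```python
def apply_stuck_at(value, bit, stuck_value):
    """Force bit `bit` of an unsigned integer to `stuck_value` (0 or 1)."""
    mask = 1 << bit
    return value | mask if stuck_value else value & ~mask


def inject_faults(fitness, faulty_cells, bit, stuck_value):
    """Return a copy of `fitness` with a stuck-at fault in the listed cells.

    Assumption: each cell holds one integer fitness score.
    """
    return [
        apply_stuck_at(score, bit, stuck_value) if index in faulty_cells else score
        for index, score in enumerate(fitness)
    ]


fitness = [12, 7, 9, 3]
print(inject_faults(fitness, {1, 3}, bit=3, stuck_value=1))  # [12, 15, 9, 11]
print(inject_faults(fitness, {0}, bit=3, stuck_value=0))     # [4, 7, 9, 3]
```

A fault-injection helper of this kind could be used to corrupt a chosen fraction of fitness cells before selection and then observe whether the search still converges.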
The final aspect investigated in this thesis is the dynamic characteristic of cGAs. cGAs,
and EAs in general, are known to be stochastic search techniques. Hence, adaptive systems
are required to continue to perform effectively in a changing environment, particularly when
tackling real-world problems. The adaptation in cellular engines is mainly achieved through
dynamic balancing between exploration and exploitation. This area has received
considerable attention from researchers who focus on improving the algorithmic
performance without incurring additional computational effort.
The structural properties and the genetic operations provide ways to control selection
pressure and, as a result, the exploration/exploitation trade-off. In this thesis, the genetic
operations of cGAs, particularly the selection aspect and their influence on the search
process, are investigated in order to dynamically control the exploration/exploitation trade-off.
Two adaptive-dynamic techniques that use genetic diversity and convergence speeds to
guide the search are proposed. Results obtained by evaluating the proposed approaches on a
test bench of diverse-characteristic real-world and test problems showed that the proposed
dynamic cGAs improve on the performance of their static counterparts and of other dynamic cGAs. For
example, the proposed Diversity-Guided 3D-cGA outperformed all the other dynamic cGAs
evaluated by obtaining a higher search success rate that reached 55%.
Automated Design Space Exploration and Datapath Synthesis for Finite Field Arithmetic with Applications to Lightweight Cryptography
Today, emerging technologies are reaching astronomical proportions. For example, the Internet
of Things has numerous applications and consists of countless different devices using different
technologies with different capabilities. But the one invariant is their connectivity. Consequently,
secure communications, and cryptographic hardware as a means of providing them, are faced
with new challenges. Cryptographic algorithms intended for hardware implementations must be
designed with a good trade-off between implementation efficiency and sufficient cryptographic
strength. Finite fields are widely used in cryptography. Examples of algorithm design choices
related to finite field arithmetic are the field size, which arithmetic operations to use, how to
represent the field elements, etc. As there are many parameters to be considered and analyzed, an
automation framework is needed.
This thesis proposes a framework for automated design, implementation and verification of finite
field arithmetic hardware. The underlying motif throughout this work is “math meets hardware”.
The automation framework is designed to bring the awareness of underlying mathematical
structures to the hardware design flow. It is implemented in GAP, an open source computer algebra
system that can work with finite fields and has symbolic computation capabilities. The framework
is roughly divided into two phases, the architectural decisions and the automated design generation.
The architectural decisions phase supports parameter search and produces a list of candidates.
The automated design generation phase is invoked for each candidate, and the generated VHDL
files are passed on to conventional synthesis tools. The candidates and their implementation results
form the design space, and the framework allows rapid design space exploration in a systematic
way. In this thesis, design space exploration is focused on finite field arithmetic.
Three distinctive features of the proposed framework are the structure of finite fields, tower field
support, and on-the-fly submodule generation. Each finite field used in the design is represented as
both a field and its corresponding vector space. It is easy for a designer to switch between fields
and vector spaces, but strict distinction of the two is necessary for hierarchical designs. When an
expression is defined over an extension field, the top-level module contains element signals and
submodules for arithmetic operations on those signals. The submodules are generated with
corresponding vector signals and the arithmetic operations are now performed on the coordinates.
For tower fields, the submodules are generated for the subfield operations, and the design is generated
in a top-down fashion. The binding of expressions to the appropriate finite fields or vector spaces
and a set of customized methods allow the on-the-fly generation of expressions for implementation
of arithmetic operations, and hence submodule generation.
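
The dual view of a finite field element, as a field element and as a coordinate vector, can be sketched as follows. This is an illustrative Python example and not the GAP-based framework itself: the field GF(2^4), the polynomial basis, the reduction polynomial x^4 + x + 1, and all function names are assumptions made here for demonstration.

```python
DEGREE = 4
MODULUS = 0b10011  # x^4 + x + 1, irreducible over GF(2) (illustrative choice)


def gf_add(a, b):
    """Addition in GF(2^4): coordinate-wise XOR."""
    return a ^ b


def gf_mul(a, b):
    """Multiplication in GF(2^4) using shift-and-add with modular reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << DEGREE):
            a ^= MODULUS  # reduce whenever the degree reaches 4
    return result


def to_vector(a):
    """Coordinates of `a` over GF(2), lowest-degree coefficient first."""
    return [(a >> i) & 1 for i in range(DEGREE)]


x = 0b0010       # the element x
x_cubed = 0b1000 # the element x^3
product = gf_mul(x, x_cubed)
print(product, to_vector(product))  # 3 [1, 1, 0, 0], since x^4 = x + 1
```

In a hardware setting the integer corresponds to an element signal at the top level, while the coordinate list corresponds to the vector signals on which a generated submodule would operate.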
In light of the NIST Lightweight Cryptography (LWC) Project, this work focuses mainly on small
finite fields. The thesis illustrates the impact of hardware implementation results during the design
process of WAGE, a Round 2 candidate in the NIST LWC standardization competition. WAGE
is a hardware oriented authenticated encryption scheme. The parameter selection for WAGE was
aimed at balancing the security and hardware implementation area, using hardware implementation
results for many design decisions, for example field size, representation of field elements, etc.
In the proposed framework, the components of WAGE are used as an example to illustrate different
automation flows and demonstrate the design space exploration on a real-world algorithm.
Hardware Implementations of the WG-16 Stream Cipher with Composite Field Arithmetic
The WG stream cipher family consists of stream ciphers based on the Welch-Gong (WG) transformations that are used as a nonlinear filter applied to the output of a linear feedback shift register (LFSR). The aim of this thesis is an exploration of the design space of the WG-16 stream cipher. Five different representations of the field elements were analyzed, namely the polynomial basis representation, the normal basis representation and three isomorphic tower field constructions of $\mathbb{F}_{2^{16}}$: $\mathbb{F}_{(((2^2)^2)^2)^2}$, $\mathbb{F}_{(2^4)^4}$ and $\mathbb{F}_{(2^8)^2}$. Each design option begins with an in-depth description of different field constructions and their impact on the top-level WG transformation circuit. Normal basis representation of elements for each level of the tower was chosen for field constructions $\mathbb{F}_{(((2^2)^2)^2)^2}$ and $\mathbb{F}_{(2^4)^4}$, and a mixed basis, with polynomial basis for the lower and normal basis for the higher level of the tower, for $\mathbb{F}_{(2^8)^2}$. Representation of field elements affects the field arithmetic, which in turn affects the entire design. Targeting high throughput, pipelined architectures were developed, and pipelining was based on the particular field construction: each extension over the prime field offers a new pipelining possibility. Pipelining at a lower level of the tower field reduces the clock period. The most flexible pipelining options are possible for $\mathbb{F}_{(((2^2)^2)^2)^2}$, a highly regular construction, which permits an algebraic optimization of the WG transformation resulting in two multiplications being removed. High speed, achieved by adequate pipelining granularity, and smaller area due to the removed multipliers make $\mathbb{F}_{(((2^2)^2)^2)^2}$ the most suitable field construction for the implementation of WG-16. The best WG-16 modules achieve a throughput of 222 Mbit/s with 476 slices used on the Xilinx Spartan-6 FPGA device xc6slx9 (using Xilinx Synthesis Tool (XST) for synthesis and ISE for implementation [47]) and a throughput of 529 Mbit/s with an area cost of 12215 GEs for ASIC implementation, using the 65 nm CMOS technology (using Synopsys Design Compiler for synthesis [45] and Cadence SoC Encounter to complete the Place-and-Route phase).
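
The idea of a tower field construction can be illustrated at a much smaller scale. The sketch below builds GF((2^2)^2) from GF(2^2), so that multiplication in the larger field is expressed entirely through multiplications in the subfield. It is a toy analogue, not the WG-16 design: the polynomial basis, the defining relations w^2 = w + 1 and y^2 = y + w, and the function names are assumptions chosen here, whereas the thesis uses normal and mixed bases over a 16-bit field.

```python
W = 0b10  # the element w of GF(2^2), where w^2 = w + 1 (illustrative choice)


def mul4(a, b):
    """Multiply in GF(2^2); bit 0 is the coefficient of 1, bit 1 that of w."""
    a0, a1 = a & 1, a >> 1
    b0, b1 = b & 1, b >> 1
    lo = (a0 & b0) ^ (a1 & b1)              # constant term, using w^2 = w + 1
    hi = (a0 & b1) ^ (a1 & b0) ^ (a1 & b1)  # coefficient of w
    return lo | (hi << 1)


def mul16(a, b):
    """Multiply in GF((2^2)^2), with y^2 = y + w.

    Elements are pairs (lo, hi) meaning lo + hi*y, with lo and hi in GF(2^2).
    Every operation below is a subfield multiplication or an XOR.
    """
    a0, a1 = a
    b0, b1 = b
    high_product = mul4(a1, b1)
    lo = mul4(a0, b0) ^ mul4(high_product, W)
    hi = mul4(a0, b1) ^ mul4(a1, b0) ^ high_product
    return (lo, hi)


print(mul16((0, 1), (0, 1)))  # (2, 1), i.e. y * y = w + y
```

Each call to `mul4` corresponds to a subfield multiplier, so the boundaries between tower levels are natural places to insert pipeline registers, which matches the pipelining argument made in the abstract.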