10 research outputs found

    Fast Fourier transform on a 3D FPGA

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 53-55). Fast Fourier Transforms perform a vital role in many applications from astronomy to cellphones. The complexity of these algorithms results from the many computational steps, including multiplications, they require and, as such, many researchers focus on implementing better FFT systems. However, all research to date focuses on the algorithm within a 2-Dimensional architecture, ignoring the opportunities available in recently proposed 3-Dimensional implementation technologies. This project examines FFTs in a 3D context, developing an architecture on a Field Programmable Gate Array system, to demonstrate the advantage of a 3D system. By Elizabeth Basha. S.M.

    Wiring requirement and three-dimensional integration technology for field programmable gate arrays

    In this paper analytical models for predicting interconnect requirements in field-programmable gate arrays (FPGAs) are presented, and opportunities for 3-D implementation of FPGAs are examined. The analytical models for 2-D FPGAs are calibrated by routing and placement experiments with benchmark circuits and extended to 3-D FPGAs. Based on system-level modeling, we find that in FPGAs with 20K 4-input look-up tables, the reduction in channel width, interconnect delay, and power dissipation can be over 50% by 3-D implementation

    Book of Knowledge (BOK) for NASA Electronic Packaging Roadmap

    The objective of this document is to update the NASA roadmap on packaging technologies (initially released in 2007) and to present the current trends toward further reducing size and increasing functionality. Due to the breadth of work being performed in the area of microelectronics packaging, this report presents only a number of key packaging technologies detailed in three industry roadmaps for conventional microelectronics and a more recently introduced roadmap for organic and printed electronics applications. The topics for each category were down-selected by reviewing the 2012 reports of the International Technology Roadmap for Semiconductors (ITRS), the 2013 roadmap reports of the International Electronics Manufacturing Initiative (iNEMI), the 2013 roadmap of IPC (Association Connecting Electronics Industries), and the roadmap of the Organic and Printed Electronics Association (OE-A). The report also summarizes the results of numerous articles and websites specifically discussing the trends in microelectronics packaging technologies.

    A Field Programmable Gate Array Architecture for Two-Dimensional Partial Reconfiguration

    Reconfigurable machines can accelerate many applications by adapting to their needs through hardware reconfiguration. Partial reconfiguration allows the reconfiguration of a portion of a chip while the rest of the chip is busy working on tasks. Operating system models have been proposed for partially reconfigurable machines to handle the scheduling and placement of tasks. They are called OS4RC in this dissertation. The main goal of this research is to address some problems that come from the gap between OS4RC and existing chip architectures and the gap between OS4RC models and practical applications. Some existing OS4RC models are based on an impractical assumption that there is no data exchange channel between IP (Intellectual Property) circuits residing on a Field Programmable Gate Array (FPGA) chip and between an IP circuit and FPGA I/O pins. For models that do not have such an assumption, their inter-IP communication channels have severe drawbacks. Those channels do not work well with 2-D partial reconfiguration, they are not suitable for intensive data stream processing, and they are frequently very complicated to design and very expensive. To address these problems, a new chip architecture that can better support inter-IP and IP-I/O communication is proposed and a corresponding OS4RC kernel is then specified. The proposed FPGA architecture is based on an array of clusters of configurable logic blocks, with each cluster serving as a partial reconfiguration unit, and a mesh of segmented buses that provides inter-IP and IP-I/O communication channels. The proposed OS4RC kernel takes care of the scheduling, placement, and routing of circuits under the constraints of the proposed architecture. Features of the new architecture in turn reduce the kernel execution times and enable runtime scheduling, placement, and routing. The area cost and the configuration memory size of the new chip architecture are calculated and analyzed, and the efficiency of the OS4RC kernel is evaluated via simulation using three different task models.

    MICROELECTRONICS PACKAGING TECHNOLOGY ROADMAPS, ASSEMBLY RELIABILITY, AND PROGNOSTICS

    This paper reviews the industry roadmaps on commercial-off-the-shelf (COTS) microelectronics packaging technologies, covering the current trends toward further reducing size and increasing functionality. Due to the breadth of work being performed in this field, this paper presents only a number of key packaging technologies. The topics for each category were down-selected by reviewing reports of industry roadmaps including the International Technology Roadmap for Semiconductors (ITRS) and by surveying publications of the International Electronics Manufacturing Initiative (iNEMI) and the roadmap of IPC (Association Connecting Electronics Industries). The paper also summarizes the findings of numerous articles and websites devoted to emerging trends in microelectronics packaging technologies. A brief discussion was presented on the packaging hierarchy from die to package and to system levels. Key elements of reliability for packaging assemblies were presented, followed by a definition of reliability from a probabilistic failure perspective. An example was presented showing the conventional reliability approach using Monte Carlo simulation results for a number of plastic ball grid array (PBGA) assemblies. The simulation results were compared to experimental thermal cycle test data. Prognostic health monitoring (PHM) methods, a growing field for microelectronics packaging technologies, were briefly discussed. The artificial neural network (ANN), a data-driven PHM method, was discussed in detail. Finally, the paper presented interpolations and extrapolations using ANN simulation for thermal cycle test data of PBGA and ceramic BGA (CBGA) assemblies.

    Design automation and analysis of three-dimensional integrated circuits

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 165-176). This dissertation concerns the design of circuits and systems for an emerging technology known as three-dimensional integration. By stacking individual components, dice, or whole wafers using a high-density electromechanical interconnect, three-dimensional integration can achieve scalability and performance exceeding that of conventional fabrication technologies. There are two main contributions of this thesis. The first is a computer-aided design flow for the digital components of a three-dimensional integrated circuit (3-D IC). This flow primarily consists of two software tools: PR3D, a placement and routing tool for custom 3-D ICs based on standard cells, and 3-D Magic, a tool for designing, editing, and testing physical layout characteristics of 3-D ICs. The second contribution of this thesis is a performance analysis of the digital components of 3-D ICs. We use the above tools to determine the extent to which 3-D integration can improve timing, energy, and thermal performance. In doing so, we verify the estimates of stochastic computational models for 3-D IC interconnects and find that the models predict the optimal 3-D wire length to within 20% accuracy. We expand upon this analysis by examining how 3-D technology factors affect the optimal wire length that can be obtained. Our ultimate analysis extends this work by directly considering timing and energy in 3-D ICs. In all cases we find that significant performance improvements are possible. In contrast, thermal performance is expected to worsen with the use of 3-D integration. We examine precisely how thermal behavior scales in 3-D integration and determine quantitatively how the temperature may be controlled during the circuit placement process. We also show how advanced packaging technologies may be leveraged to maintain acceptable die temperatures in 3-D ICs. Finally, we explore two issues for the future of 3-D integration. We determine how technology scaling impacts the effect of 3-D integration on circuit performance. We also consider how to improve the performance of digital components in a mixed-signal 3-D integrated circuit. We conclude with a look towards future 3-D IC design tools. By Shamik Das. Ph.D.

    Dynamic and fault tolerant three-dimensional cellular genetic algorithms

    In the area of artificial intelligence, the development of Evolutionary Algorithms (EAs) has been very active, especially in the last decade. These algorithms started to evolve when scientists from various regions of the world applied the principles of evolution to algorithmic search and problem solving. EAs have been utilised successfully in diverse complex application areas. Their success in tackling hard problems has been the engine of the field of Evolutionary Computation (EC). Nowadays, EAs are considered to be the best solution to use when facing a hard search or optimisation problem. Various improvements are continually being made with the design of new operators, hybrid models, among others. A very important example of such improvements is the use of parallel models of GAs (PGAs). PGAs have received widespread attention from various researchers as they have proved to be more effective than panmictic GAs, especially in terms of efficacy and speedup. This thesis focuses on, and investigates, cellular Genetic Algorithms (cGAs), a competitive variant of parallel GAs. In a cGA, the tentative solutions evolve in overlapped neighbourhoods, allowing smooth diffusion of the solutions. The benefits derived from using cGAs come not only from flexibility gains and their fitness to the objective target in combination with a robust behaviour but also from their high performance and amenability to implementation using advanced custom silicon chip technologies. Nowadays, cGAs are considered as adaptable concepts for solving problems, especially complex optimisation problems. Due to their structural characteristics, cGAs are able to promote an adequate exploration/exploitation trade-off and thus maintain genetic diversity. Moreover, cGAs are characterised as being massively parallel and easy to implement. The structural characteristics inherent in a cGA provide an active area for investigation. 
Because of the vital role grid structure plays in determining the effectiveness of the algorithm, cellular dimensionality is the main issue to be investigated here. The implementation of cGAs is commonly carried out on a one- or two-dimensional structure. Studies that investigate higher cellular dimensions are lacking. Accordingly, this research focuses on cGAs that are implemented on a three-dimensional structure. Having a structure with three dimensions, specifically a cubic structure, facilitates faster spreading of solutions due to the shorter radius and denser neighbourhood that result from the vertical expansion of cells. In this thesis, a comparative study of cellular dimensionality is conducted. Simulation results demonstrate higher performance achieved by 3D-cGAs over their 2D-cGA counterparts. The direct implementation of 3D-cGAs on the new advanced 3D-IC technology will provide added benefits such as higher performance combined with a reduction in interconnection delays, routing length, and power consumption. The maintenance of system reliability and availability is a major concern that must be addressed. A system is likely to fail due to either hard or soft errors. Therefore, detecting a fault before it deteriorates system performance is a crucial issue. Single Event Upsets (SEUs), or soft errors, do not cause permanent damage to system functionality, and can be handled using fault-tolerant techniques. Existing fault-tolerant techniques include hardware or software fault tolerance, or a combination of both. In this thesis, fault-tolerant techniques that mitigate SEUs at the algorithmic level are explored and the inherent abilities of cGAs to deal with these errors are investigated. A fault-tolerant technique and several mitigation techniques are also proposed, faulty critical data are evaluated, and critical fault scenarios (stuck-at-'1' and stuck-at-'0' faults) are taken into consideration. 
Chief among several test and real-world problems is the problem of determining the attitude of a vehicle using a Global Positioning System (GPS), which is an example of a hard real-time application. Results illustrate the ability of cGAs to maintain their functionality and give an adequate performance even with the existence of up to 40% errors in fitness score cells. The final aspect investigated in this thesis is the dynamic characteristic of cGAs. cGAs, and EAs in general, are known to be stochastic search techniques. Hence, adaptive systems are required to continue to perform effectively in a changing environment, particularly when tackling real-world problems. The adaptation in cellular engines is mainly achieved through dynamic balancing between exploration and exploitation. This area has received considerable attention from researchers who focus on improving the algorithmic performance without incurring additional computational effort. The structural properties and the genetic operations provide ways to control selection pressure and, as a result, the exploration/exploitation trade-off. In this thesis, the genetic operations of cGAs, particularly the selection aspect and their influence on the search process, are investigated in order to dynamically control the exploration/exploitation trade-off. Two adaptive-dynamic techniques that use genetic diversity and convergence speeds to guide the search are proposed. Results obtained by evaluating the proposed approaches on a test bench of diverse-characteristic real-world and test problems showed improvement in the performance of dynamic cGAs over their static counterparts and other dynamic cGAs. For example, the proposed Diversity-Guided 3D-cGA outperformed all the other dynamic cGAs evaluated by obtaining a higher search success rate that reached 55%.
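As a rough illustration of the "shorter radius" argument in the abstract above, the following Python sketch counts how many neighbour-to-neighbour steps a solution needs to reach every cell of a 2D grid and of a 3D grid holding the same number of cells. The toroidal (wrap-around) grid, the von Neumann neighbourhood, the 64-cell population, and the function names are assumptions made for this example; the grid and neighbourhood settings actually used in the thesis are not stated here.

```python
from collections import deque

def neighbours(cell, dims):
    """Von Neumann neighbours of a cell on a wrap-around (toroidal) grid (assumed topology)."""
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            nxt = list(cell)
            nxt[axis] = (cell[axis] + step) % size
            yield tuple(nxt)

def spread_steps(dims):
    """Breadth-first search: steps for information at the origin to reach every cell."""
    start = (0,) * len(dims)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        for nb in neighbours(cell, dims):
            if nb not in dist:
                dist[nb] = dist[cell] + 1
                queue.append(nb)
    return max(dist.values())

steps_2d = spread_steps((8, 8))      # 64 cells arranged as a 2D grid
steps_3d = spread_steps((4, 4, 4))   # the same 64 cells arranged as a cube
print(steps_2d, steps_3d)
```

Under these assumptions the 8 x 8 grid needs 8 steps and the 4 x 4 x 4 cube needs 6, which is the kind of radius reduction the abstract attributes to the vertical expansion of cells. The sketch models only the spread of information, not selection, recombination, or fitness evaluation.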

    Automated Design Space Exploration and Datapath Synthesis for Finite Field Arithmetic with Applications to Lightweight Cryptography

    Today, emerging technologies are reaching astronomical proportions. For example, the Internet of Things has numerous applications and consists of countless different devices using different technologies with different capabilities. But the one invariant is their connectivity. Consequently, secure communications, and cryptographic hardware as a means of providing them, are faced with new challenges. Cryptographic algorithms intended for hardware implementations must be designed with a good trade-off between implementation efficiency and sufficient cryptographic strength. Finite fields are widely used in cryptography. Examples of algorithm design choices related to finite field arithmetic are the field size, which arithmetic operations to use, how to represent the field elements, etc. As there are many parameters to be considered and analyzed, an automation framework is needed. This thesis proposes a framework for automated design, implementation and verification of finite field arithmetic hardware. The underlying motif throughout this work is “math meets hardware”. The automation framework is designed to bring the awareness of underlying mathematical structures to the hardware design flow. It is implemented in GAP, an open source computer algebra system that can work with finite fields and has symbolic computation capabilities. The framework is roughly divided into two phases, the architectural decisions and the automated design generation. The architectural decisions phase supports parameter search and produces a list of candidates. The automated design generation phase is invoked for each candidate, and the generated VHDL files are passed on to conventional synthesis tools. The candidates and their implementation results form the design space, and the framework allows rapid design space exploration in a systematic way. In this thesis, design space exploration is focused on finite field arithmetic. 
Three distinctive features of the proposed framework are the structure of finite fields, tower field support, and on the fly submodule generation. Each finite field used in the design is represented as both a field and its corresponding vector space. It is easy for a designer to switch between fields and vector spaces, but strict distinction of the two is necessary for hierarchical designs. When an expression is defined over an extension field, the top-level module contains element signals and submodules for arithmetic operations on those signals. The submodules are generated with corresponding vector signals and the arithmetic operations are now performed on the coordinates. For tower fields, the submodules are generated for the subfield operations, and the design is generated in a top-down fashion. The binding of expressions to the appropriate finite fields or vector spaces and a set of customized methods allow the on the fly generation of expressions for implementation of arithmetic operations, and hence submodule generation. In the light of NIST Lightweight Cryptography Project (LWC), this work focuses mainly on small finite fields. The thesis illustrates the impact of hardware implementation results during the design process of WAGE, a Round 2 candidate in the NIST LWC standardization competition. WAGE is a hardware oriented authenticated encryption scheme. The parameter selection for WAGE was aimed at balancing the security and hardware implementation area, using hardware implementation results for many design decisions, for example field size, representation of field elements, etc. In the proposed framework, the components of WAGE are used as an example to illustrate different automation flows and demonstrate the design space exploration on a real-world algorithm

    Hardware Implementations of the WG-16 Stream Cipher with Composite Field Arithmetic

    The WG stream cipher family consists of stream ciphers based on the Welch-Gong (WG) transformations that are used as a nonlinear filter applied to the output of a linear feedback shift register (LFSR). The aim of this thesis is an exploration of the design space of the WG-16 stream cipher. Five different representations of the field elements were analyzed, namely the polynomial basis representation, the normal basis representation and three isomorphic tower field constructions of $\mathbb{F}_{2^{16}}$: $\mathbb{F}_{(((2^2)^2)^2)^2}$, $\mathbb{F}_{(2^4)^4}$ and $\mathbb{F}_{(2^8)^2}$. Each design option begins with an in-depth description of different field constructions and their impact on the top-level WG transformation circuit. Normal basis representation of elements for each level of the tower was chosen for field constructions $\mathbb{F}_{(((2^2)^2)^2)^2}$ and $\mathbb{F}_{(2^4)^4}$, and a mixed basis, with polynomial basis for the lower and normal basis for the higher level of the tower, for $\mathbb{F}_{(2^8)^2}$. Representation of field elements affects the field arithmetic, which in turn affects the entire design. Targeting high throughput, pipelined architectures were developed, and pipelining was based on the particular field construction: each extension over the prime field offers a new pipelining possibility. Pipelining at a lower level of the tower field reduces the clock period. The most flexible pipelining options are possible for $\mathbb{F}_{(((2^2)^2)^2)^2}$, a highly regular construction, which permits an algebraic optimization of the WG transformation resulting in two multiplications being removed. High speed, achieved by adequate pipelining granularity, and smaller area due to the removed multipliers make $\mathbb{F}_{(((2^2)^2)^2)^2}$ the most suitable field construction for the implementation of WG-16. 
The best WG-16 modules achieve a throughput of 222 Mbit/s with 476 slices used on the Xilinx Spartan-6 FPGA device xc6slx9 (using Xilinx Synthesis Tool (XST) for synthesis and ISE for implementation [47]) and a throughput of 529 Mbit/s with area cost of 12215 GEs for ASIC implementation, using the 65 nm CMOS technology (using Synopsys Design Compiler for synthesis [45] and Cadence SoC Encounter to complete the Place-and-Route phase)
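To make the idea of a tower field concrete, here is a small Python sketch of multiplication in a two-level tower with 16 elements, built as a quadratic extension of the 4-element field. It is far smaller than the 2^16-element constructions studied in the thesis. The polynomial basis at both levels, the defining polynomials x^2 + x + 1 and y^2 + y + x, the integer encoding of elements, and all function names are choices made for this example and are not taken from the thesis, which uses normal and mixed bases.

```python
def mul_gf4(a, b):
    """Multiply in GF(2^2). An element a1*x + a0 is stored as bits (a1 a0), with x^2 = x + 1."""
    a1, a0 = a >> 1, a & 1
    b1, b0 = b >> 1, b & 1
    high = (a1 & b1) ^ (a1 & b0) ^ (a0 & b1)
    low = (a0 & b0) ^ (a1 & b1)
    return (high << 1) | low

PHI = 2  # the element x of GF(2^2); y^2 + y + PHI has no root in GF(2^2), so it is irreducible

def mul_gf16(a, b):
    """Multiply in GF((2^2)^2). An element h*y + l is stored as (h << 2) | l, with y^2 = y + PHI."""
    ah, al = a >> 2, a & 3
    bh, bl = b >> 2, b & 3
    hh = mul_gf4(ah, bh)                       # coefficient of y^2 before reduction
    high = hh ^ mul_gf4(ah, bl) ^ mul_gf4(al, bh)
    low = mul_gf4(al, bl) ^ mul_gf4(PHI, hh)   # y^2 contributes PHI*hh to the constant term
    return (high << 2) | low

def inverse_gf16(a):
    """Brute-force inverse, adequate for a 16-element field."""
    return next(b for b in range(1, 16) if mul_gf16(a, b) == 1)

# Every nonzero element should have an inverse if the construction is a field.
print(all(mul_gf16(a, inverse_gf16(a)) == 1 for a in range(1, 16)))
```

Each multiplication in the larger field is expressed entirely through multiplications and additions (XOR) in the subfield, which is the property that lets a hardware design be decomposed level by level along the tower.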