173 research outputs found
Case Study: First-Time Success ASIC Design Methodology Applied to a Multi-Processor System-on-Chip
Achieving first-time success is crucial in ASIC design, given soaring costs, tight time-to-market windows, and a competitive business environment. One key factor in ensuring first-time success is a well-defined ASIC design methodology. Here we propose a novel ASIC design methodology that has been proven on the RUMPS401 (Rahman University Multi-Processor System 401) Multiprocessor System-on-Chip (MPSoC) project, initiated by the Universiti Tunku Abdul Rahman (UTAR) VLSI design center. The proposed methodology includes the use of the Universal Verification Methodology (UVM). The use of electronic design automation (EDA) software during each step of the design methodology is also presented. The first-time success of RUMPS401 demonstrates the proposed ASIC design methodology and the benefit of using one, especially since the project was carried out in an educational environment that is even more limited in budget, resources, and know-how than its business and industrial counterparts. The methodology presented here is tailored to first-time-success MPSoC design.
EDA design for Microscale Modular Assembled ASIC (M2A2) circuits
As the semiconductor industry has driven the minimum feature size well below 50nm, the mask cost of making devices has skyrocketed. The cost of a full set of masks is estimated to be about 2M for 65nm lithography nodes. According to some estimates, mask writing time grows as the fifth power as feature sizes decrease below 50nm. In addition, the higher complexity of large designs increases the number of design re-spins. These two factors lead to a considerable increase in the nonrecurring engineering (NRE) cost of standard-cell ASICs, which has become prohibitively expensive for low- to mid-volume applications. Field-programmable gate arrays (FPGAs) offer an acceptable solution for fast prototyping and ultra-low-volume applications, but are generally not seen as a replacement for ASICs because of their highly inefficient space utilization, lower performance/speed, and high power consumption. This is particularly the case as mobility has driven expectations for small form factor and low power consumption. In this work, a new type of ASIC named Microscale Modular Assembled ASIC (M2A2) is proposed. This technology is a novel application of a high-speed, precision assembly technique to the fabrication of ASICs using a limited number of mass-produced feedstock logic circuits. The idea is to share the mask cost for sub-100nm feature sizes across a large number of ASIC designs, decreasing the NRE for individual designs. The concept of constructing ASICs from repeating logic elements is based on previous work showing that ASICs made of via/metal-configured structured elements can achieve space utilization and performance comparable to cell-based ASICs. In the proposed technique, however, we provide significantly more choice in the transistor layer, in terms of feedstock types and their configuration. This thesis deals with the electronic design automation (EDA) design for Microscale Modular Assembled ASIC based circuits.
The document discusses the design of feedstock cells, the generation of feedstock preplaced designs, the generation of design collaterals to support the M2A2 EDA flow, and the front-end M2A2 synthesis flow needed to meet the required design functionality and achieve optimal quality of results (QoR) metrics in terms of circuit performance/speed, power, and area. Electrical and Computer Engineering
Physical design and verification for Microscale Modular Assembled ASIC (M2A2) circuits
The overall goal of this project is to bring down the fabrication cost of low-volume ASICs by introducing a novel 'pick and place' mechanism for micro-scale elements of ASICs, referred to here as feedstock. This new feedstock-based ASIC design flow is referred to as the Microscale Modular Assembled ASIC (M2A2) design flow. This report complements efforts in fabrication and other Electronic Design Automation (EDA) aspects carried out by researchers at The University of Texas at Austin studying this new mechanism for ASIC design and manufacture. For the purpose of this study, the conventional industrial practice in ASIC design flow was analyzed and modifications to that flow were explored. The initial synthesis solution was developed using Synopsys's Design Compiler (DC) tool. However, due to the limitations of the tool, the final solution was developed based on Cadence tools. The main blocks of the design flow in this report are Synthesis and analysis of its capabilities; Conformal ECO; Post-Mask spare cell mapping; Post-Mask Clock Tree Synthesis (CTS) and Route; Post-Mask timing and Design Rule Violation (DRV) fixing; and Verification. The standard-cell-based ASIC design was used as a benchmark and compared to the M2A2 design flow. Electrical and Computer Engineering
Using Fine Grain Approaches for highly reliable Design of FPGA-based Systems in Space
The use of SRAM-based FPGAs in space missions is increasingly being considered because of their flexibility and reprogrammability. A challenge is the devices' sensitivity to radiation effects, which has increased with modern architectures due to smaller CMOS structures. This work proposes fault-tolerance methodologies that are based on a fine-grain view of modern reconfigurable architectures. The focus is on the challenges of mitigating single-event upsets (SEUs) in SRAM-based FPGAs, which can result in critical situations.
Dependable Embedded Systems
This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. The book introduces the most prominent reliability concerns from today's point of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, this book deals with the different reliability challenges across levels, from the physical level all the way to the system level (cross-layer approaches). The book aims to demonstrate how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with the latest insights into novel, cross-layer methods and models with respect to the dependability of embedded systems; describes cross-layer approaches that can leverage reliability through techniques that are proactively designed with respect to techniques at other layers; explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.
Regular Datapaths on Field-Programmable Gate Arrays
Field-Programmable Gate Arrays (FPGAs) are a recent kind of programmable logic device. They allow the implementation of integrated digital electronic circuits without requiring the complex optical, chemical, and mechanical processes used in conventional chip fabrication. FPGAs can be embedded in traditional system design flows to perform prototyping and emulation tasks. In addition, they enable novel applications such as configurable computers with hardware dynamically adaptable to a specific problem. The growing chip capacity now allows even the implementation of CPUs and DSPs on single FPGAs. However, current design automation tools trace their roots to times of very limited FPGA sizes and are primarily optimized for the implementation of random glue logic. The wide datapaths common to CPUs and DSPs are only processed with reduced performance. This thesis presents Structured Design Implementation (SDI), a suite of specialized tools coordinated by a common strategy, which aims to efficiently map even larger regular datapaths to FPGAs. In all steps, regularity is preserved whenever possible, or restored after disruptive operations were required. The circuits are composed from parametrizable modules providing a variety of logical, arithmetical, and storage functions. For each module, multiple target FPGA-specific implementation alternatives may be generated in both gate-level netlist and layout views. A floorplanner based on a genetic algorithm is then used to simultaneously choose an actual implementation from the set of alternatives for each module and to arrange the selected module implementations in a linear placement. The floorplanning operation optimizes for short routing delays, high routability, and fit into the target FPGA.
[Translated from the German abstract] Field-programmable gate arrays (FPGAs) are a still-young kind of programmable logic device. They allow the implementation of integrated digital circuits without the complicated optical, chemical, and mechanical processes normally required for chip fabrication. FPGAs can be used within conventional design methods for emulation and prototyping. They also enable entirely new applications such as reconfigurable computers whose hardware can be dynamically adapted to a specific problem. The increased chip capacity now even allows CPUs and digital signal processors (DSPs) to be implemented on a single FPGA. However, the performance of the resulting circuits is limited by currently available CAD tools, which are still geared toward severely limited FPGA sizes and primarily serve the area-efficient processing of irregular logic. The wide bit-sliced datapaths that form the core of many CPUs and DSPs are handled only suboptimally. This thesis presents Structured Design Implementation (SDI), a system of specialized CAD tools that efficiently maps even larger regular datapaths onto FPGAs. In all processing steps, existing regularity is preserved as far as possible or restored after regularity-destroying operations. For circuit entry, a library of general modules from the areas of logic, arithmetic, and storage is available. These can be adapted to current requirements by setting various parameters such as bit widths and data types.
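The floorplanning step in the abstract above (choosing one implementation alternative per module while ordering the modules in a linear placement) can be illustrated with a small sketch. This is not the SDI floorplanner: it is a mutation-only evolutionary search standing in for a full genetic algorithm, and the module library, net list, FPGA width, and cost weights are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical module library: each module lists alternative implementations
# as (width_in_columns, internal_delay) pairs. Not taken from the thesis.
MODULES = {
    "reg": [(2, 1.0), (1, 1.6)],
    "add": [(4, 2.0), (3, 2.8)],
    "mux": [(1, 0.5), (2, 0.4)],
    "shift": [(3, 1.2), (2, 1.9)],
}
# Hypothetical connectivity: pairs of modules that exchange data.
NETS = [("reg", "add"), ("add", "mux"), ("mux", "shift"), ("reg", "shift")]
FPGA_WIDTH = 9  # assumed number of available columns


def cost(order, choice):
    """Lower is better: wire length + module delay + overflow penalty."""
    pos, x = {}, 0
    for name in order:  # left edge of each module in the linear placement
        pos[name] = x
        x += MODULES[name][choice[name]][0]
    wire = sum(abs(pos[a] - pos[b]) for a, b in NETS)
    delay = sum(MODULES[n][choice[n]][1] for n in order)
    overflow = max(0, x - FPGA_WIDTH)  # columns that do not fit
    return wire + delay + 100.0 * overflow


def mutate(order, choice, rng):
    """Either swap two modules or switch one module's alternative."""
    order, choice = list(order), dict(choice)
    if rng.random() < 0.5:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    else:
        name = rng.choice(order)
        choice[name] = rng.randrange(len(MODULES[name]))
    return order, choice


def floorplan(generations=200, pop_size=20, seed=0):
    rng = random.Random(seed)
    names = list(MODULES)
    start = (names, {n: 0 for n in names})
    population = [start] + [mutate(*start, rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda ind: cost(*ind))
        survivors = population[: pop_size // 2]  # elitist selection
        children = [mutate(*rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return min(population, key=lambda ind: cost(*ind))
```

Calling `floorplan()` returns an `(order, choice)` pair. Because the best individuals always survive, the result is never worse under `cost` than the starting arrangement. A real floorplanner would add crossover and a routability model.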
Fault-tolerant satellite computing with modern semiconductors
Miniaturized satellites enable a variety of space missions that were in the past infeasible, impractical, or uneconomical with traditionally designed heavier spacecraft. CubeSats in particular can be manufactured and launched rapidly at low cost from commercial components, even in academic environments. However, due to their low reliability and brief lifetime, they are usually not considered suitable for life- and safety-critical services, complex multi-phased solar-system-exploration missions, and missions with a longer duration. Commercial electronics are key to satellite miniaturization, but are also responsible for their low reliability: until 2019, there existed no reliable or fault-tolerant computer architectures suitable for very small satellites. To overcome this deficit, a novel on-board-computer architecture is described in this thesis. Robustness is assured without resorting to radiation hardening, through software measures implemented within a robust-by-design multiprocessor system-on-chip. This fault-tolerant architecture is component-wise simple and can dynamically adapt to changing performance requirements throughout a mission. It can support graceful aging by exploiting FPGA reconfiguration and mixed criticality. Experimentally, we achieve 1.94 W power consumption at 300 MHz with a Xilinx Kintex UltraScale+ proof-of-concept, which is well within the power budget range of current 2U CubeSats. To our knowledge, this is the first COTS-based, reproducible on-board-computer architecture that can offer strong fault coverage even for small CubeSats. European Space Agency. Computer Systems, Imagery and Media
High performance scientific computing in applications with direct finite element simulation
xiii, 133 p. Predicting separated flow, including the stall of a full aircraft, by means of computational fluid dynamics (CFD) is considered one of the grand challenges to be solved by 2030, according to NASA. The nonlinear Navier-Stokes equations provide the mathematical formulation for fluid flow in three-dimensional space. However, classical solutions, existence, and uniqueness are still missing. Since brute-force computation is intractable for predictive simulation of a full aircraft, one can use direct numerical simulation (DNS); however, it is prohibitively expensive because it needs to resolve turbulence at a scale of order Re^(9/4). Other methods, such as the statistical-average Reynolds-Averaged Navier-Stokes (RANS), the spatial-average Large Eddy Simulation (LES), and the hybrid Detached Eddy Simulation (DES), require fewer degrees of freedom. All of these methods must be tuned to benchmark problems and, moreover, the mesh near walls has to be very fine to resolve the boundary layers (which means the computational cost is very high). Above all, the results are sensitive to, for example, explicit parameters in the method, the mesh, etc. As a solution to this challenge, we present the adaptive Direct FEM Simulation (DFS) methodology with numerical tripping, a predictive, parameter-free family of methods for turbulent flow. We solved the JAXA Standard Model (JSM) aircraft model at a realistic Reynolds number, presented as part of the 3rd High Lift Prediction Workshop. We predicted lift Cl within 5% error versus experiment, drag Cd within 10% error, and stall within 1° of the angle of attack. The workshop identified a probable experimental error of order 10% in the drag results. The simulation is 10 times faster and cheaper than traditional or existing CFD approaches.
The efficiency comes mainly from the slip boundary condition, which allows coarse meshes near walls; goal-oriented adaptive error control, which refines the mesh only where needed; and large time steps using a Schur-type fixed-point iteration method, without compromising the accuracy of the simulation results. We also present a generalization of DFS to variable density, validated against the well-established MARIN benchmark problem. The results show good agreement with experimental results in the form of pressure sensor measurements. We then use this methodology to solve two applications in multiphase flow problems. One concerns a flash rainwater storage tank (Bilbao water consortium), and the second concerns the design of a nozzle for 3D printing. For the rainwater storage tank, we predicted that the water height in the tank significantly influences how the flow behaves downstream of the tank gate (valve). For 3D printing, we developed an efficient design with a focused jet flow to prevent oxidation and heating at the nozzle tip during the melting process. Finally, we present parallelism on multiple GPUs and on the Kalray embedded-system architecture. Almost all of today's supercomputers have heterogeneous architectures, such as CPU+GPU or other accelerators, and it is therefore essential to develop computational frameworks that take advantage of them. As we have seen, CFD began to develop in the late 1960s, once computational power became available; it is therefore essential to use and test these accelerators for CFD computations. GPUs have a different architecture from traditional CPUs.
Technically, a GPU has many more cores than a CPU, which makes it a good choice for parallel computing. For multiple GPUs, we developed a stencil computation applied to the simulation of geological folds. We explored halo computation and used CUDA streams to optimize computation and communication time. The resulting performance gain was 23% for four GPUs with the Fermi architecture, and the corresponding improvement on four Kepler GPUs was 47%. This research was carried out at the Basque Center for Applied Mathematics (BCAM) within the CFD Computational Technology (CFDCT) and also at the School of Electrical Engineering and Computer Science (Royal Institute of Technology, Stockholm, Sweden). It was supported by Fundacion Obra Social “la Caixa”, Severo Ochoa Excellence research centre 2014-2018 SEV-2013-0323, Severo Ochoa Excellence research centre 2018-2022 SEV-2017-0718, BERC program 2014-2017, BERC program 2018-2021, MSO4SC European project, Elkartek. This work was performed using the computing infrastructure of SNIC (Swedish National Infrastructure for Computing).
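The multi-GPU stencil with halo exchange mentioned in the abstract above can be sketched on the CPU. The following is a minimal illustration, not the thesis code: it assumes a 1D three-point averaging stencil (the geological-folding stencil is not specified in the abstract), simulates each "device" as a Python list, and does not model the overlap of computation and communication that CUDA streams provide. All function names are hypothetical.

```python
def stencil_step(u):
    """One 3-point averaging step; the first and last entries stay fixed."""
    interior = [(u[i - 1] + u[i] + u[i + 1]) / 3.0
                for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def split_with_halos(u, parts):
    """Cut u into chunks, padding each with one halo cell per inner edge."""
    n = len(u) // parts
    bounds = [(p * n, len(u) if p == parts - 1 else (p + 1) * n)
              for p in range(parts)]
    return [u[max(lo - 1, 0): min(hi + 1, len(u))] for lo, hi in bounds]


def exchange_halos(chunks):
    """Copy each chunk's outermost owned cell into its neighbor's halo."""
    for p in range(len(chunks) - 1):
        left, right = chunks[p], chunks[p + 1]
        left[-1] = right[1]   # right neighbor's first owned cell
        right[0] = left[-2]   # left neighbor's last owned cell


def multi_device_steps(u, parts, steps):
    """Run `steps` stencil iterations on `parts` simulated devices."""
    chunks = split_with_halos(u, parts)
    for _ in range(steps):
        exchange_halos(chunks)                      # communication phase
        chunks = [stencil_step(c) for c in chunks]  # computation phase
    out = []
    for p, c in enumerate(chunks):  # gather owned cells, dropping halos
        start = 0 if p == 0 else 1
        stop = len(c) if p == parts - 1 else len(c) - 1
        out.extend(c[start:stop])
    return out
```

For example, `multi_device_steps(u, 4, 5)` should match five applications of `stencil_step` to the undivided list `u`, which is a convenient correctness check for the halo logic before moving it to real devices.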
- …