Programmable built-in self-testing of embedded RAM clusters in system-on-chip architectures
Multiport memories are widely used as embedded cores in all communication system-on-chip devices. Due to their high complexity and very low accessibility, built-in self-test (BIST) is the most common solution implemented to test the different memories embedded in the system. This article presents a programmable BIST architecture based on a single microprogrammable BIST processor and a set of memory wrappers designed to simplify the test of a system containing a large number of distributed multiport memories of different sizes (number of bits, number of words), access protocols (asynchronous, synchronous), and timing
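Programmable memory BIST processors of this kind typically execute march algorithms over the memory under test. As a minimal illustrative sketch (not the article's architecture), the following Python model generates the classic March C- sequence and applies it to a list-backed RAM; all names here are hypothetical:

```python
# Hypothetical software model of a March C- BIST pass over a small RAM.
# A real BIST processor issues these operations in hardware; this sketch
# only illustrates the march-element structure.

def march_c_minus(ram_size):
    """Yield (address, op, value) steps for the six March C- elements."""
    up = range(ram_size)
    down = range(ram_size - 1, -1, -1)
    elements = [
        (up,   [("w", 0)]),                # up/down (w0)
        (up,   [("r", 0), ("w", 1)]),      # up (r0, w1)
        (up,   [("r", 1), ("w", 0)]),      # up (r1, w0)
        (down, [("r", 0), ("w", 1)]),      # down (r0, w1)
        (down, [("r", 1), ("w", 0)]),      # down (r1, w0)
        (up,   [("r", 0)]),                # up/down (r0)
    ]
    for order, ops in elements:
        for addr in order:
            for op, val in ops:
                yield addr, op, val

def run_bist(memory):
    """Apply the march sequence to a list-backed RAM; True if fault-free."""
    for addr, op, val in march_c_minus(len(memory)):
        if op == "w":
            memory[addr] = val
        elif memory[addr] != val:   # read-and-compare
            return False            # mismatch -> memory fault detected
    return True
```

March C- performs 10N operations for an N-word memory, which the generator reproduces exactly.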
Incorporation of feed-network and circuit modeling into the time-domain finite element analysis of antenna arrays and microwave circuits
In this dissertation, accurate and efficient numerical algorithms are developed to incorporate the feed-network and circuit modeling into the time-domain finite element analysis of antenna arrays and microwave circuits. First, simulation of an antenna system requires accurate modeling of interactions between the radiating elements and the associated feeding network. In this work, a feed network is represented in terms of its scattering matrix in a rational function form in the frequency domain that enables its interfacing with the time-domain finite element modeling of the antenna elements through a fast recursive time-convolution algorithm. The exchange of information between the antenna elements and the feed network occurs through the incident and reflected modal voltages/currents at properly defined port interfaces. The proposed numerical scheme allows a full utilization of the advanced antenna simulation techniques, and significantly extends the current antenna modeling capability to the system level. Second, a hybrid field-circuit solver that combines the capabilities of the time-domain finite element method and a lumped circuit analysis is developed for accurate and efficient characterization of complicated microwave circuits that include both distributive and lumped-circuit components. The distributive portion of the device is modeled by the time-domain finite element method to generate a finite element subsystem, while the lumped circuits are analyzed by a SPICE-like circuit solver to generate a circuit subsystem. A global system for both the finite-element and circuit unknowns is established by combining the two subsystems through coupling matrices to model their interactions. 
For simulations of even more complicated mixed-scale circuit systems that contain pre-characterized blocks of discrete circuit elements, the hybrid field-circuit analysis implemented a systematic and efficient algorithm to incorporate multiport lumped networks in terms of frequency-dependent admittance matrices. Other advanced features in the hybrid field-circuit solver include application of the tree-cotree splitting algorithm and introduction of a flexible time-stepping scheme. Various numerical examples are presented to validate the implementation and demonstrate the accuracy, efficiency, and applications of the proposed numerical algorithms
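The key to the fast recursive time-convolution mentioned above is that a rational (pole/residue) frequency response admits an O(1)-per-step update instead of re-summing the whole history. A minimal single-pole sketch, with placeholder pole/residue values not taken from the dissertation:

```python
import math

# Illustrative sketch (not the dissertation's code): recursive time-domain
# convolution for a single-pole rational term H(s) = r / (s - p), p < 0.

def recursive_convolution(x, dt, pole, residue):
    """O(N) evaluation of y(t) = integral of r*exp(p*(t-tau))*x(tau) dtau
    on a uniform grid.

    Uses the update y[n] = e^{p*dt} * y[n-1] + dt * r * x[n], so each step
    costs O(1) instead of requiring the full convolution history.
    """
    decay = math.exp(pole * dt)
    y, state = [], 0.0
    for xn in x:
        state = decay * state + dt * residue * xn   # one-pole recursion
        y.append(state)
    return y
```

A multi-pole rational scattering matrix is handled the same way, with one such recursion state per pole and per port pair.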
A study of aspects of synchronisation and communication in certain parallel computer architectures
This paper examines methods for synchronisation and communication between tasks in highly parallel arrays of processors. The development of various methods is researched, and simulation techniques are applied to specific structures to examine their effectiveness. Two approaches to simulation are presented: in the first case, a discrete event simulator is applied to task synchronisation implemented with semaphores in a close-coupled environment. Secondly, the concurrent programming language Occam is used to simulate a systolic configuration of processors. In this case the design is verified through actual system construction.
Conclusions are drawn regarding the design disciplines and structure imposed by the use of these simulation techniques. A close relationship is found between the behaviour of a simulation written in Occam and the same structure constructed from multiple processors.
Further research is suggested into the subject of dataflow processors, to find suitable means for simulating such systems, prior to implementation. A type of test vehicle is proposed that would operate a dataflow processor under the control of the development system
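The semaphore-based task synchronisation studied in the first simulation approach can be sketched in a few lines. This is an illustrative model only (the task bodies are placeholders, not the paper's simulator): two close-coupled tasks coordinate access to a shared buffer through counting semaphores.

```python
import threading

# Minimal sketch of semaphore-based synchronisation between two
# close-coupled tasks: a bounded producer/consumer pair.

items = []
slots = threading.Semaphore(4)   # free buffer slots
filled = threading.Semaphore(0)  # produced items awaiting the consumer
lock = threading.Lock()          # guards the shared buffer

def producer(n):
    for i in range(n):
        slots.acquire()          # wait for a free slot
        with lock:
            items.append(i)
        filled.release()         # signal the consumer

def consumer(n, out):
    for _ in range(n):
        filled.acquire()         # wait for data
        with lock:
            out.append(items.pop(0))
        slots.release()          # free the slot

out = []
t1 = threading.Thread(target=producer, args=(10,))
t2 = threading.Thread(target=consumer, args=(10, out))
t1.start(); t2.start()
t1.join(); t2.join()
```

The same structure maps naturally onto Occam channels, which is one reason the paper finds a close correspondence between the Occam simulation and the multi-processor construction.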
NEPP Update of Independent Single Event Upset Field Programmable Gate Array Testing
This presentation provides a NASA Electronic Parts and Packaging (NEPP) Program update of independent Single Event Upset (SEU) Field Programmable Gate Array (FPGA) testing including FPGA test guidelines, Microsemi RTG4 heavy-ion results, Xilinx Kintex-UltraScale heavy-ion results, Xilinx UltraScale+ single event effect (SEE) test plans, development of a new methodology for characterizing SEU system response, and NEPP involvement with FPGA security and trust
Addressing Fiber-to-Chip Coupling Issues in Silicon Photonics
This thesis addresses the problem of interconnecting (coupling) a silicon photonic integrated circuit (chip) with the outside world, i.e. an optical fiber. This is one of the most important issues facing the silicon integrated optics community today. Although very high quality silicon photonic integrated circuits can be fabricated using standard CMOS tools, the interface with the optical fiber remains the dominant source of loss, owing to the large size mismatch between the propagating modes of the fiber and of the photonic integrated circuit waveguides. Addressing this problem is therefore essential for using silicon photonic integrated circuits in practical applications.
Objectives: The purpose of this work is to tackle this problem at the fiber-chip coupling interface, with emphasis on the final assembly or packaging. The main objectives are therefore: 1) study, modeling, and design optimization of different efficient coupling techniques between optical fibers and silicon photonic integrated circuits; 2) fabrication and experimental demonstration of the resulting designs; 3) assembly and packaging of some of the fabricated coupling prototypes.
Methodology: This work follows two lines of research, corresponding to the two main coupling strategies found in the literature: grating coupling structures (the fiber couples vertically onto the circuit surface) and inverted-taper structures (the fiber couples horizontally at the circuit edge).
Results: For both grating and inverted-taper structures, significant advances over the state of the art are achieved. Regarding gratings, two types of structures are demonstrated. On the one hand, gratings suitable for coupling to conventional silicon waveguides are demonstrated. On the other hand, gratings for horizontal slot silicon waveguides, a very promising waveguide type for nonlinear optics applications, are demonstrated for the first time. Regarding inverted-taper coupling, a novel structure based on this type of coupling is demonstrated. With this structure, important advances are achieved in packaging optical fibers with the silicon circuit. Its innovative integration with V-groove structures is presented as a means of passively aligning arrays of multiple fibers to a single photonic integrated circuit. The packaging of multiple-fiber arrays using grating couplers is also studied, resulting in a compact packaged prototype.
Galán Conejos, JV. (2010). Addressing Fiber-to-Chip Coupling Issues in Silicon Photonics [unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/9196
Pyramic array: An FPGA based platform for many-channel audio acquisition
Array processing of audio data has many interesting applications: acoustic beamforming, source separation, indoor localization, room geometry estimation, etc. Recent advances in MEMS have produced tiny microphones, analog or even with an integrated digital converter. This opens the door to arrays with a massive number of microphones; we dub such an array many-channel, by analogy to many-core processors. Microphone array techniques present compelling applications for robotics. They can allow robots to listen to their environment and infer clues from it, enabling capabilities such as natural interaction with humans, interpreting spoken commands, or localizing victims during search and rescue tasks. However, under noisy conditions robotic implementations of microphone arrays may lose precision when localizing sound sources. For practical applications, microphone arrays still lag behind human hearing: Daniel Kish is an example of how humans can efficiently perform echolocation to recognize their environment, even in noisy and reverberant conditions. For ubiquitous computing, another limitation of acoustic localization algorithms lies in their ability to perform real-time Digital Signal Processing (DSP) operations. Tradeoffs between size, weight, cost, and power consumption therefore constrain the design of acoustic sensors for practical applications. This work presents the design and operation of a large microphone array for DSP applications in realistic environments: the Pyramic sound capture system, designed at LAP, EPFL. Pyramic is custom hardware with 48 microphones distributed along the edges of a tetrahedron.
The microphone arrays interact with a Terasic DE1-SoC board built on an Altera Cyclone V device, which combines a Hard Processor System (HPS) and a Field Programmable Gate Array (FPGA) on the same die. The HPS integrates a dual-core ARM Cortex-A9 processor, which, combined with the FPGA fabric, is well suited to processing multichannel microphone signals. This thesis explains the implementation of the Pyramic array. Moreover, FPGA-based hardware accelerators have been designed to implement SPI master communication with the array and a parallel 48-channel FIR filter cascade over the audio data for delay-and-sum beamforming applications. Additionally, the configuration of the HPS allows the Pyramic array to be controlled through a Linux-based OS. The main purpose of the project is to develop a flexible platform on which real-time echolocation algorithms can be implemented. The effectiveness of the Pyramic array design is illustrated by testing the recorded data with offline direction-of-arrival algorithms developed at LCAV, EPFL
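The delay-and-sum beamforming mentioned above reduces, at its core, to aligning each channel by a steering delay and averaging. A minimal sketch, assuming per-channel integer sample delays have already been computed from the array geometry and steering direction (the values used here are placeholders, not Pyramic's FPGA implementation):

```python
# Illustrative delay-and-sum beamformer over multichannel frames.

def delay_and_sum(channels, delays):
    """Align each channel by its delay (in samples) and average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   non-negative integer delay per channel, in samples.
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for ch, d in zip(channels, delays):
            acc += ch[t - d] if t >= d else 0.0   # shifted sample, zero-padded
        out.append(acc / len(channels))
    return out
```

On the FPGA, the per-channel delays are realized by the FIR filter cascade, which also allows fractional-sample steering; the averaging step is unchanged.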
Towards Energy-Efficient and Reliable Computing: From Highly-Scaled CMOS Devices to Resistive Memories
The continuous increase in transistor density driven by Moore's Law has led us to highly scaled Complementary Metal-Oxide-Semiconductor (CMOS) technologies. These transistor-based process technologies offer improved density as well as a reduction in nominal supply voltage. An analysis comparing different aspects of the 45nm and 15nm technologies, such as power consumption and cell area, is carried out on an IEEE 754 single-precision floating-point unit implementation. Based on the results, the 15nm technology offers 4 times lower energy and a 3-fold smaller footprint. New challenges also arise, such as the growing relative share of leakage power in standby mode, which can be addressed by post-CMOS technologies. Spin-Transfer Torque Random Access Memory (STT-MRAM) has been explored as a post-CMOS technology for embedded and data storage applications seeking non-volatility, near-zero standby energy, and high density. Towards attaining these objectives in practical implementations, various techniques to mitigate the specific reliability challenges associated with STT-MRAM elements are surveyed, classified, and assessed herein. Cost and suitability metrics assessed include the area of nanomagnetic and CMOS components per bit, access time and complexity, Sense Margin (SM), and energy or power consumption costs versus resiliency benefits. In an attempt to further improve the Process Variation (PV) immunity of the Sense Amplifiers (SAs), a new SA called the Adaptive Sense Amplifier (ASA) has been introduced. ASA achieves a low Bit Error Rate (BER) and a low Energy Delay Product (EDP) by combining the properties of two commonly used SAs: the Pre-Charge Sense Amplifier (PCSA) and the Separated Pre-Charge Sense Amplifier (SPCSA). ASA can operate in either PCSA or SPCSA mode based on the requirements of the circuit, such as energy efficiency or reliability.
ASA is then utilized in a novel approach that actually leverages PV in Non-Volatile Memory (NVM) arrays: the Self-Organized Sub-bank (SOS) design. SOS engages the preferred SA alternative based on the intrinsic, as-built behavior of the resistive sensing timing margin, reducing latency and power consumption while maintaining acceptable access time
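The SOS idea can be summarized as a per-sub-bank mode decision. The following sketch is purely illustrative of that decision structure: the threshold, units, and mode-selection rule are hypothetical stand-ins, not values or logic from this dissertation.

```python
# Hypothetical sketch of per-sub-bank SA mode selection, illustrating the
# Self-Organized Sub-bank idea. Threshold and rule are placeholders.

def select_sa_mode(sense_margin_mv, margin_threshold_mv=50.0):
    """Pick the SA configuration for one sub-bank.

    Wide as-built sense margin  -> PCSA mode (assumed lower EDP);
    narrow as-built sense margin -> SPCSA mode (assumed more PV-tolerant).
    """
    return "PCSA" if sense_margin_mv >= margin_threshold_mv else "SPCSA"

def configure_subbanks(margins_mv):
    """Map each sub-bank's measured margin to its preferred SA mode."""
    return [select_sa_mode(m) for m in margins_mv]
```

The point of the design is that this selection is made once, from measured as-built behavior, rather than provisioning every sub-bank for the worst-case process corner.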
A Dynamically Reconfigurable Parallel Processing Framework with Application to High-Performance Video Processing
Digital video processing demands have grown, and will continue to grow, at unprecedented rates. Growth comes from the ever-increasing volume of data, the demand for higher resolution and higher frame rates, and the need for high-capacity communications. Moreover, economic realities force continued reductions in size, weight, and power requirements. The ever-changing needs and complexities associated with effective video processing systems lead to the consideration of dynamically reconfigurable systems. The goal of this dissertation research was to develop and demonstrate the viability of an integrated parallel processing system that effectively and efficiently applies pre-optimized hardware cores to streamed video data. Digital video is decomposed into packets, which are then distributed over a group of parallel video processing cores. Real-time processing requires an effective task scheduler that distributes video packets efficiently to any of the reconfigurable distributed processing nodes across the framework, with the nodes running on FPGA reconfigurable logic in an inherently 'virtual' mode. The developed framework, coupled with hardware techniques for dynamic processing optimization, achieves an optimal cost/power/performance realization for video processing applications. The system is evaluated by testing processor utilization relative to I/O bandwidth and algorithm latency using a separable 2-D FIR filtering system and a dynamic pixel processor. For these applications, the system can process hundreds of 640x480 video frames per second across an eight-lane Gen 1 PCIe bus. Overall, performance is optimal in the sense that video data is processed at the maximum rate that can be streamed through the processing cores.
This performance, coupled with the inherent ability to dynamically add new algorithms to the described dynamically reconfigurable distributed processing framework, creates new opportunities for realizable and economical hardware virtualization.
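The packet-distribution problem described above can be illustrated with a simple greedy dispatcher: each packet goes to whichever processing node will become free soonest. This is a sketch of the scheduling idea only, not the dissertation's scheduler; node counts and packet costs are placeholder values.

```python
import heapq

# Illustrative least-loaded dispatch of video packets to parallel nodes.

def schedule_packets(packet_costs, num_nodes):
    """Greedy least-loaded dispatch; returns (assignment list, makespan).

    packet_costs: processing cost of each packet, in arbitrary time units.
    """
    heap = [(0.0, node) for node in range(num_nodes)]  # (busy-until, node id)
    heapq.heapify(heap)
    assignment = []
    for cost in packet_costs:
        busy_until, node = heapq.heappop(heap)   # soonest-free node wins
        assignment.append(node)
        heapq.heappush(heap, (busy_until + cost, node))
    makespan = max(t for t, _ in heap)
    return assignment, makespan
```

With equal-cost packets this degenerates to round-robin; its value shows when packet costs vary, as they do for content-dependent video algorithms.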
A Deep Learning Framework for Hydrogen-fueled Turbulent Combustion Simulation
The high cost of high-resolution computational fluid/flame dynamics (CFD) has hindered its application in combustion-related design, research, and optimization. In this study, we propose a new framework for turbulent combustion simulation based on a deep learning approach. An optimized deep convolutional neural network (CNN), inspired by the U-Net architecture and the inception module, is designed as the core of the deep learning solver, named CFDNN. CFDNN is then trained on simulation results of hydrogen combustion in a cavity with different inlet velocities. After training, CFDNN not only accurately predicts the flow and combustion fields within the range of the training set, but also shows an ability to extrapolate outside the training set. The results from the CFDNN solver show excellent consistency with conventional CFD results in terms of both predicted spatial distributions and temporal dynamics, while two orders of magnitude of acceleration are achieved compared to the conventional CFD solver. The successful development of such a deep learning-based solver opens up new possibilities for low-cost, high-accuracy simulation, fast prototyping, design optimization, and real-time control of combustion systems such as gas turbines and scramjets
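Once a network has learned the one-step map field(t) to field(t + dt), the surrogate solver is just an autoregressive rollout of that map. The sketch below shows only this rollout structure; the "model" is a toy linear stand-in, not CFDNN or any trained network.

```python
# Conceptual sketch of surrogate time stepping: repeatedly apply a learned
# one-step operator to advance the field. The model here is a placeholder.

def rollout(model, initial_field, n_steps):
    """Advance the field n_steps by repeatedly applying the operator."""
    states = [initial_field]
    for _ in range(n_steps):
        states.append(model(states[-1]))
    return states

def toy_model(field, decay=0.9):
    # stand-in for the trained CNN: damps each grid value toward zero
    return [decay * v for v in field]
```

The two-orders-of-magnitude speedup reported above comes from each such step being a single network inference instead of an implicit CFD solve.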