55 research outputs found

    IAPSA 2 small-scale system specification

    Get PDF
    The details of a hardware implementation of a representative small scale flight critical system is described using Advanced Information Processing System (AIPS) building block components and simulated sensor/actuator interfaces. The system was used to study application performance and reliability issues during both normal and faulted operation

    A Multi-layer Fpga Framework Supporting Autonomous Runtime Partial Reconfiguration

    Get PDF
    Partial reconfiguration is a unique capability provided by several Field Programmable Gate Array (FPGA) vendors recently, which involves altering part of the programmed design within an SRAM-based FPGA at run-time. In this dissertation, a Multilayer Runtime Reconfiguration Architecture (MRRA) is developed, evaluated, and refined for Autonomous Runtime Partial Reconfiguration of FPGA devices. Under the proposed MRRA paradigm, FPGA configurations can be manipulated at runtime using on-chip resources. Operations are partitioned into Logic, Translation, and Reconfiguration layers along with a standardized set of Application Programming Interfaces (APIs). At each level, resource details are encapsulated and managed for efficiency and portability during operation. An MRRA mapping theory is developed to link the general logic function and area allocation information to the device related physical configuration level data by using mathematical data structure and physical constraints. In certain scenarios, configuration bit stream data can be read and modified directly for fast operations, relying on the use of similar logic functions and common interconnection resources for communication. A corresponding logic control flow is also developed to make the entire process autonomous. Several prototype MRRA systems are developed on a Xilinx Virtex II Pro platform. The Virtex II Pro on-chip PowerPC core and block RAM are employed to manage control operations while multiple physical interfaces establish and supplement autonomous reconfiguration capabilities. Area, speed and power optimization techniques are developed based on the developed Xilinx prototype. Evaluations and analysis of these prototype and techniques are performed on a number of benchmark and hashing algorithm case studies. The results indicate that based on a variety of test benches, up to 70% reduction in the resource utilization, up to 50% improvement in power consumption, and up to 10 times increase in run-time performance are achieved using the developed architecture and approaches compared with Xilinx baseline reconfiguration flow. Finally, a Genetic Algorithm (GA) for a FPGA fault tolerance case study is evaluated as a ultimate high-level application running on this architecture. It demonstrated that this is a hardware and software infrastructure that enables an FPGA to dynamically reconfigure itself efficiently under the control of a soft microprocessor core that is instantiated within the FPGA fabric. Such a system contributes to the observed benefits of intelligent control, fast reconfiguration, and low overhead

    Reducing redundancy of real time computer graphics in mobile systems

    Get PDF
    The goal of this thesis is to propose novel and effective techniques to eliminate redundant computations that waste energy and are performed in real-time computer graphics applications, with special focus on mobile GPU micro-architecture. Improving the energy-efficiency of CPU/GPU systems is not only key to enlarge their battery life, but also allows to increase their performance because, to avoid overheating above thermal limits, SoCs tend to be throttled when the load is high for a large period of time. Prior studies pointed out that the CPU and especially the GPU are the principal energy consumers in the graphics subsystem, being the off-chip main memory accesses and the processors inside the GPU the primary energy consumers of the graphics subsystem. First, we focus on reducing redundant fragment processing computations by means of improving the culling of hidden surfaces. During real-time graphics rendering, objects are processed by the GPU in the order they are submitted by the CPU, and occluded surfaces are often processed even though they will end up not being part of the final image. When the GPU realizes that an object or part of it is not going to be visible, all activity required to compute its color and store it has already been performed. We propose a novel architectural technique for mobile GPUs, Visibility Rendering Order (VRO), which reorders objects front-to-back entirely in hardware to maximize the culling effectiveness of the GPU and minimize overshading, hence reducing execution time and energy consumption. VRO exploits the fact that the objects in graphics animated applications tend to keep its relative depth order across consecutive frames (temporal coherence) to provide the feeling of smooth transition. VRO keeps visibility information of a frame, and uses it to reorder the objects of the following frame. VRO just requires adding a small hardware to capture the visibility information and use it later to guide the rendering of the following frame. Moreover, VRO works in parallel with the graphics pipeline, so negligible performance overheads are incurred. We illustrate the benefits of VRO using various unmodified commercial 3D applications for which VRO achieves 27% speed-up and 14.8% energy reduction on average. Then, we focus on avoiding redundant computations related to CPU Collision Detection (CD). Graphics applications such as 3D games represent a large percentage of downloaded applications for mobile devices and the trend is towards more complex and realistic scenes with accurate 3D physics simulations. CD is one of the most important algorithms in any physics kernel since it identifies the contact points between the objects of a scene and determines when they collide. However, real-time accurate CD is very expensive in terms of energy consumption. We propose Render Based Collision Detection (RBCD), a novel energy-efficient high-fidelity CD scheme that leverages some intermediate results of the rendering pipeline to perform CD, so that redundant tasks are done just once. Comparing RBCD with a conventional CD completely executed in the CPU, we show that its execution time is reduced by almost three orders of magnitude (600x speedup), because most of the CD task of our model comes for free by reusing the image rendering intermediate results. Although not necessarily, such a dramatic time improvement may result in better frames per second if physics simulation stays in the critical path. However, the most important advantage of our technique is the enormous energy savings that result from eliminating a long and costly CPU computation and converting it into a few simple operations executed by a specialized hardware within the GPU. Our results show that the energy consumed by CD is reduced on average by a factor of 448x (i.e., by 99.8\%). These dramatic benefits are accompanied by a higher fidelity CD analysis (i.e., with finer granularity), which improves the quality and realism of the application.El objetivo de esta tesis es proponer técnicas efectivas y originales para eliminar computaciones inútiles que aparecen en aplicaciones gráficas, con especial énfasis en micro-arquitectura de GPUs. Mejorar la eficiencia energética de los sistemas CPU/GPU no es solo clave para alargar la vida de la batería, sino también incrementar su rendimiento. Estudios previos han apuntado que la CPU y especialmente la GPU son los principales consumidores de energía en el sub-sistema gráfico, siendo los accesos a memoria off-chip y los procesadores dentro de la GPU los principales consumidores de energía del sub-sistema gráfico. Primero, nos hemos centrado en reducir computaciones redundantes de la fase de fragment processing mediante la mejora en la eliminación de superficies ocultas. Durante el renderizado de gráficos en tiempo real, los objetos son procesados por la GPU en el orden en el que son enviados por la CPU, y las superficies ocultas son a menudo procesadas incluso si no no acaban formando parte de la imagen final. Cuando la GPU averigua que el objeto o parte de él no es visible, toda la actividad requerida para computar su color y guardarlo ha sido realizada. Proponemos una técnica arquitectónica original para GPUs móviles, Visibility Rendering Order (VRO), la cual reordena los objetos de delante hacia atrás por completo en hardware para maximizar la efectividad del culling de la GPU y así minimizar el overshading, y por lo tanto reducir el tiempo de ejecución y el consumo de energía. VRO explota el hecho de que los objetos de las aplicaciones gráficas animadas tienden a mantener su orden relativo en profundidad a través de frames consecutivos (coherencia temporal) para proveer animaciones con transiciones suaves. Dado que las relaciones de orden en profundidad entre objetos son testeadas en la GPU, VRO introduce costes mínimos en energía. Solo requiere añadir una pequeña unidad hardware para capturar la información de visibilidad. Además, VRO trabaja en paralelo con el pipeline gráfico, por lo que introduce costes insignificantes en tiempo. Ilustramos los beneficios de VRO usango varias aplicaciones 3D comerciales para las cuales VRO consigue un 27% de speed-up y un 14.8% de reducción de energía en media. En segundo lugar, evitamos computaciones redundantes relacionadas con la Detección de Colisiones (CD) en la CPU. Las aplicaciones gráficas animadas como los juegos 3D representan un alto porcentaje de las aplicaciones descargadas en dispositivos móviles y la tendencia es hacia escenas más complejas y realistas con simulaciones físicas 3D precisas. La CD es uno de los algoritmos más importantes entre los kernel de físicas dado que identifica los puntos de contacto entre los objetos de una escena. Sin embargo, una CD en tiempo real y precisa es muy costosa en términos de consumo energético. Proponemos Render Based Collision Detection (RBCD), una técnica energéticamente eficiente y preciso de CD que utiliza resultados intermedios del rendering pipeline para realizar la CD. Comparando RBCD con una CD convencional completamente ejecutada en la CPU, mostramos que el tiempo de ejecución es reducido casi tres órdenes de magnitud (600x speedup), porque la mayoría de la CD de nuestro modelo reusa resultados intermedios del renderizado de la imagen. Aunque no es así necesariamente, esta espectacular en tiempo puede resultar en mejores frames por segundo si la simulación de físicas está en el camino crítico. Sin embargo, la ventaja más importante de nuestra técnica es el enorme ahorro de energía que resulta de eliminar las largas y costosas computaciones en la CPU, sustituyéndolas por unas pocas operaciones ejecutadas en un hardware especializado dentro de la GPU. Nuestros resultados muestran que la energía consumida por la CD es reducidad en media por un factor de 448x. Estos dramáticos beneficios vienen acompañados de una mayor fidelidad en la CD (i.e. con granularidad más fina)Postprint (published version

    An investigation to determine the producibility of a 3-D braider and bias direction weaving loom

    Get PDF
    The development of prototype machines for the production of generalized braid patterns is described. Mechanical operating principles and control strategies are presented for two prototype machines which were fabricated and evaluated. Both machines represent advances over current techniques for forming composite material preforms by enabling near ideal control of fiber orientation. Furthermore, they overcome both the lack of general control of produced fiber architectures and the complexity of other weaving processes that were produced for the same purpose. One prototype, the modified Farley braider, consists of an array of turntables which can be rotated 90 degrees and returned; hence, they can form tracks in the x and y axis. Yarn ends are transported about the surface formed by the turntables using motorized tractors. These tractors are controlled using an optical link with a control circuit and host computer. The tractors are powered through electrical contact with the turntables. The necessary relative motions are produced by a series of linear tractor moves combined with a sequence of turntable rotations. The movement of the tractors about the surface causes the yarns to produce the desired braiding pattern. The second device, the shuttle plate braider, consists of a braiding surface formed by an array of square elements, each separated from its neighbor by a gap. Beneath this surface lies a shuttle plate, which reciprocates first in one axis and then in the other. As this movement takes place, yarn carrying shuttles engage and disengage the plate by means of solenoid activated pins. By selective engagement and disengagement, the shuttles can move the yarn ends in any desired pattern, forming the desired braid. Control power, and control signals, are transmitted from the electronic interface circuit and host computer, via the braiding surface through electrical contact with the shuttles. Motive power is proved to the shuttles by motion of the shuttle plate, which is passively driven using pneumatic rams. Each shuttle is a simple device that uses only a solenoid to engage the plate and is a simple device that uses only a solenoid to engage the plate and is independently controllable. When compared with each other, the modified Farley braider has the advantage of speed, and the shuttle plate braider the advantage of mechanical control and simplicity

    Implementação de Tx/Rx banda base para 802.11-2007 em FPGA

    Get PDF
    Mestrado em Engenharia Electrónica e TelecomunicaçõesO trabalho apresentado nesta dissertação teve como objectivo o desenvolvimento da camada física de um sistema de transmissão e recepção de sinais OFDM baseados no standard IEEE 802.11-2007. O sistema desenvolvido inclui geração de dados aleatórios, modulador QAM, inserção de pilotos e subportadora DC, IFFT com adição de Prefixo Cíclico, buffer de saída e o consequente oposto para o receptor. A dissertação encontra-se dividida em duas partes principais. Na primeira parte, o sistema foi projectado e simulado em Matlab através do ambiente Simulink com o auxílio dos blocos da Xilinx inseridos no seu software System Generator for DSP. Na segunda parte, foram adicionadas DACʼs ao transmissor e o próprio foi compilado para um bloco e testado no XtremeDSP Development Kit-IV da Nallatech que inclui uma Field-Programmable Gate Array. Todos os módulos foram desenhados usando os blocos do System Generator for DSP da Xilinx. O kit está conectado ao computador através de uma interface PCI. Os dados obtidos são exibidos em Matlab para a primeira parte e num osciloscópio para a segunda parte.It was the objective of this dissertation the development of the Physical Layer of an IEEE 802.11-2007 Transmitter-Receiver system for generating OFDM signals. The developed design includes random Data Generation, QAM Modulator, Pilots and DC subcarrier insertion, IFFT with Cyclic Prefix insertion, an Output Buffer and the subsequent opposite for its receiver. This dissertation was divided in two main segments. In the first segment, the system was designed and simulated in Matlab through the Simulink environment using Xilinxʼs System Generator for DSP blocks. In the second part, DACʼs where added to the transmitter in order to compile it into a single block and test it on Nallatechʼs XtremeDSP Development Kit-IV, which includes a Field-Programmable Gate Array. All modules were designed using Xilinxʼs System Generator for DSP blocks. The kit is connected to the computer through a PCI interface. Output data is displayed on the Matlab environment for part one and on an oscilloscope for part two

    Integrated circuit & system design for concurrent amperometric and potentiometric wireless electrochemical sensing

    Get PDF
    Complementary Metal-Oxide-Semiconductor (CMOS) biosensor platforms have steadily grown in healthcare and commerial applications. This technology has shown potential in the field of commercial wearable technology, where CMOS sensors aid the development of miniaturised sensors for an improved cost of production and response time. The possibility of utilising wireless power and data transmission techniques for CMOS also allows for the monolithic integration of the communication, power and sensing onto a single chip, which greatly simplifies the post-processing and improves the efficiency of data collection. The ability to concurrently utilise potentiometry and amperometry as an electrochemical technique is explored in this thesis. Potentiometry and amperometry are two of the most common transduction mechanisms for electrochemistry, with their own advantages and disadvantages. Concurrently applying both techniques will allow for real-time calibration of background pH and for improved accuracy of readings. To date, developing circuits for concurrently sensing potentiometry and amperometry has not been explored in the literature. This thesis investigates the possibility of utilising CMOS sensors for wireless potentiometric and amperometric electrochemical sensing. To start with, a review of potentiometry and amperometry is evaluated to understand the key factors behind their operation. A new configuration is proposed whereby the reference electrode for both electrochemistry techniques are shared. This configuration is then compared to both the original configurations to determine any differences in the sensing accuracy through a novel experiment that utilises hydrogen peroxide as a measurement analyte. The feasibility of the configuration with the shared reference electrode is proven and utilised as the basis of the electrochemical configuration for the front end circuits. A unique front-end circuit named DAPPER is developed for the shared reference electrode topology. A review of existing architectures for potentiometry and amperometry is evaluated, with a specific focus on low power consumption for wireless applications. In addition, both the electrochemical sensing outputs are mixed into a single output data channel for use with a near-field communication (NFC). This mixing technique is also further analysed in this thesis to understand the errors arising due to various factors. The system is fabricated on TSMC 180nm technology and consumes 28µW. It measures a linear input current range from 250pA - 0.1µW, and an input voltage range of 0.4V - 1V. This circuit is tested and verified for both electrical and electrochemical tests to showcase its feasibility for concurrent measurements. This thesis then provides the integration of wireless blocks into the system for wireless powering and data transmission. This is done through the design of a circuit named SPACEMAN that consists of the concurrent sensing front-end, wireless power blocks, data transmission, as well as a state machine that allows for the circuit to switch between modes: potentiometry only, amperometry only, concurrent sensing and none. The states are switched through re-booting the circuit. The core size of the electronics is 0.41mm² without the coil. The circuit’s wireless powering and data transmission is tested and verified through the use of an external transmitter and a connected printed circuit board (PCB) coil. Finally, the future direction for ongoing work to proceed towards a fully monolithic electrochemical technique is discussed through the next development of a fully integrated coil-on-CMOS system, on-chip electrodes with the electroplating and microfludics, the development of an external transmitter for powering the device and a test platform. The contributions of this thesis aim to formulate a use for wireless electrochemical sensors capable of concurrent measurements for use in wearable devices.Open Acces

    Using embedded hardware monitor cores in critical computer systems

    Get PDF
    The integration of FPGA devices in many different architectures and services makes monitoring and real time detection of errors an important concern in FPGA system design. A monitor is a tool, or a set of tools, that facilitate analytic measurements in observing a given system. The goal of these observations is usually the performance analysis and optimisation, or the surveillance of the system. However, System-on-Chip (SoC) based designs leave few points to attach external tools such as logic analysers. Thus, an embedded error detection core that allows observation of critical system nodes (such as processor cores and buses) should enforce the operation of the FPGA-based system, in order to prevent system failures. The core should not interfere with system performance and must ensure timely detection of errors. This thesis is an investigation onto how a robust hardware-monitoring module can be efficiently integrated in a target PCI board (with FPGA-based application processing features) which is part of a critical computing system. [Continues.

    Energy-Efficient Wireless Circuits and Systems for Internet of Things

    Full text link
    As the demand of ultra-low power (ULP) systems for internet of thing (IoT) applications has been increasing, large efforts on evolving a new computing class is actively ongoing. The evolution of the new computing class, however, faced challenges due to hard constraints on the RF systems. Significant efforts on reducing power of power-hungry wireless radios have been done. The ULP radios, however, are mostly not standard compliant which poses a challenge to wide spread adoption. Being compliant with the WiFi network protocol can maximize an ULP radio’s potential of utilization, however, this standard demands excessive power consumption of over 10mW, that is hardly compatible with in ULP systems even with heavy duty-cycling. Also, lots of efforts to minimize off-chip components in ULP IoT device have been done, however, still not enough for practical usage without a clean external reference, therefore, this limits scaling on cost and form-factor of the new computer class of IoT applications. This research is motivated by those challenges on the RF systems, and each work focuses on radio designs for IoT applications in various aspects. First, the research covers several endeavors for relieving energy constraints on RF systems by utilizing existing network protocols that eventually meets both low-active power, and widespread adoption. This includes novel approaches on 802.11 communication with articulate iterations on low-power RF systems. The research presents three prototypes as power-efficient WiFi wake-up receivers, which bridges the gap between industry standard radios and ULP IoT radios. The proposed WiFi wake-up receivers operate with low power consumption and remain compatible with the WiFi protocol by using back-channel communication. Back-channel communication embeds a signal into a WiFi compliant transmission changing the firmware in the access point, or more specifically just the data in the payload of the WiFi packet. With a specific sequence of data in the packet, the transmitter can output a signal that mimics a modulation that is more conducive for ULP receivers, such as OOK and FSK. In this work, low power mixer-first receivers, and the first fully integrated ultra-low voltage receiver are presented, that are compatible with WiFi through back-channel communication. Another main contribution of this work is in relieving the integration challenge of IoT devices by removing the need for external, or off-chip crystals and antennas. This enables a small form-factor on the order of mm3-scale, useful for medical research and ubiquitous sensing applications. A crystal-less small form factor fully integrated 60GHz transceiver with on-chip 12-channel frequency reference, and good peak gain dual-mode on-chip antenna is presented.PHDElectrical and Computer EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/162975/1/jaeim_1.pd

    Daily Eastern News: January 11, 1999

    Get PDF
    https://thekeep.eiu.edu/den_1999_jan/1000/thumbnail.jp
    • …
    corecore