    Computing moments of a binary horizontally/vertically convex image using run-time reconfiguration

    In this thesis, we present a design for computing the moments of a binary horizontally/vertically convex image on an FPGA chip using run-time reconfiguration. We compute moments up to third order, for a total of 16 moments. We address how run-time reconfiguration speeds up moment computation without consuming large hardware resources. Because the image is binary and horizontally/vertically convex, we consider an alternative method of moment computation that uses constant coefficient multipliers. We divide the image into segments and process one segment at a time, reconfiguring the constant coefficient multipliers before processing the next segment. This thesis also examines the interactions between the different logic units used for moment computation. We provide an estimate of the total number of CLBs needed to implement this design on an FPGA chip. Finally, we address variations of this type of image, such as non-binary and non-convex images, and determine whether the design is still applicable in those cases.
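
As a software illustration of why convexity helps, the sketch below computes the moments m_pq from a run representation of the image. It assumes that "16 moments" means all m_pq with p and q each in 0..3, and that a horizontally convex binary image has exactly one run of set pixels per row. Under those assumptions the inner sum over x has a closed form, and the factor y^q is constant within a row, which is one plausible reading of where constant coefficient multipliers fit. The function names are hypothetical and this is not the thesis's hardware design.

```python
def power_sum(n, p):
    """Closed-form sum of k**p for k = 0..n, for p in 0..3 (0 if n < 0)."""
    if n < 0:
        return 0
    if p == 0:
        return n + 1
    if p == 1:
        return n * (n + 1) // 2
    if p == 2:
        return n * (n + 1) * (2 * n + 1) // 6
    if p == 3:
        return (n * (n + 1) // 2) ** 2
    raise ValueError("only orders 0..3 are supported")


def moments_from_runs(runs, max_order=3):
    """runs maps row index y -> (x_start, x_end), inclusive, one run per row.

    Returns {(p, q): m_pq} with m_pq = sum over set pixels of x**p * y**q.
    """
    m = {(p, q): 0 for p in range(max_order + 1) for q in range(max_order + 1)}
    for y, (x0, x1) in runs.items():
        # Sum of x**p over the run, without visiting each pixel.
        row = [power_sum(x1, p) - power_sum(x0 - 1, p)
               for p in range(max_order + 1)]
        for q in range(max_order + 1):
            yq = y ** q  # constant for the whole row
            for p in range(max_order + 1):
                m[(p, q)] += yq * row[p]
    return m


def moments_brute_force(runs, max_order=3):
    """Pixel-by-pixel reference used only to check the run-based version."""
    m = {(p, q): 0 for p in range(max_order + 1) for q in range(max_order + 1)}
    for y, (x0, x1) in runs.items():
        for x in range(x0, x1 + 1):
            for p in range(max_order + 1):
                for q in range(max_order + 1):
                    m[(p, q)] += (x ** p) * (y ** q)
    return m


# A small horizontally convex shape: one run per row.
example_runs = {0: (2, 4), 1: (1, 5), 2: (0, 6), 3: (3, 3)}
example_moments = moments_from_runs(example_runs)
```

The run-based version does a fixed amount of work per row regardless of run length, which is the property a segment-at-a-time hardware design could exploit.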

    An embedded adaptive optics real time controller

    The design and realisation of a low cost, high speed control system for adaptive optics (AO) is presented. This control system is built around a field programmable gate array (FPGA). FPGA devices represent a fundamentally different approach to implementing control systems from that of conventional central processing units. The performance of the FPGA control system is demonstrated in a purpose-built laboratory AO experiment in which closed loop AO correction is shown. An alternative application of the control system is demonstrated in the field of optical tweezing, where it is used to study the motion dynamics of particles trapped within laser foci.

    A common operator for FFT and FEC decoding

    In the Software Radio context, parametrization is becoming an important topic, especially for multistandard designs. This paper capitalizes on the Common Operator technique to present new common structures for the FFT and FEC decoding algorithms. A key benefit of exhibiting common operators is the regular architecture they yield when implemented in a Common Operator Bank (COB). This regularity makes the architecture open to future function mapping and able to accommodate silicon technology variability through dependable design.

    Enhancement of Digital Photo Frame Capabilities With Dedicated Hardware

    Photo frames have come a long way since the days when a photo had to be printed and placed in them. In the digital era there is a new concept, the digital photo frame, a modern version of the conventional photo frame. A digital photo frame is a picture frame that displays photos without the need to print them. Frames are available in a variety of sizes and configurations. A typical frame ranges in size from 7 inches to 20 inches, and key-chain-sized frames are also available. These frames support a variety of formats such as .jpeg, .tiff and .bmp. Most frames can show photos as a slideshow, in sequential or random order, with an adjustable time interval. Photos can be loaded in several ways: directly from the camera's memory card, or from memory devices such as USB drives, SD cards and MMC cards. Bluetooth is now used as well. Another increasingly popular option is to fetch photos directly from the Internet, from sites such as Flickr and Picasa, or from e-mail. These frames generally also come with built-in speakers and remote controls. Our initial objective was to decide which features could be added to the digital photo frame that we design. To this end we ran simulations in MATLAB to establish their feasibility. The simulation work was divided into two parts: the first performed compression and decompression, and the second dealt with the enhancements that can be added to the frame. For compression and decompression we used the JPEG standard. JPEG (Joint Photographic Experts Group) is an ISO/ITU standard for compressing still images. The format is popular because of its variable compression range. Its limitations include the fact that it is lossy and that it is poorly suited to displaying text. The common extensions for it include *.jpg, *.jff, *.m-jpeg and *.mpeg. The enhancement features that we tested for feasibility include Mean Filter, Median Filter, Image Sharpening, Negative Image Extraction, Logarithmic Transformations, Power Law Correction (Gamma Correction), Contrast Stretching, Grey Level Slicing, Bit Plane Slicing and Laplace Filtering. We then proceeded to the hardware implementation of these features. We implemented only a handful of them, owing to the complexity of the design and lack of time. We first implemented the compression and decompression algorithm. The two enhancement features we implemented were the Laplace Filter and the Median Filter. For our implementation we used the VIRTEX 2 FPGA board.
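
The two enhancement features named above as implemented in hardware can be illustrated in software. The sketch below is a minimal pure-Python version of a 3 × 3 median filter and a 3 × 3 Laplace filter on a grayscale image stored as a list of rows. The 4-neighbour Laplacian kernel and the border handling (borders copied for the median, set to zero for the Laplacian) are assumptions made for illustration, since the abstract does not specify them.

```python
def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = sorted(window)[4]  # middle of 9 sorted values
    return out


def laplace_filter_3x3(img):
    """Apply the 4-neighbour Laplacian kernel [[0,1,0],[1,-4,1],[0,1,0]].

    Border pixels are set to 0 (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (img[y - 1][x] + img[y + 1][x]
                         + img[y][x - 1] + img[y][x + 1]
                         - 4 * img[y][x])
    return out


# A flat 5x5 image with a single "salt" pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter_3x3(noisy)
edges = laplace_filter_3x3(noisy)
```

On this input the median filter removes the isolated bright pixel, while the Laplace filter responds strongly at and around it, which is why the two are typically used for noise removal and edge emphasis respectively.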

    Adaptive image filtering using run-time reconfiguration

    This thesis implements an adaptive linear smoothing image filtering algorithm on a Virtex™-E FPGA using run-time reconfiguration (RTR). An adaptive filter uses a filtering window that runs over the entire image pixel by pixel, generating new (filtered) values of the pixels. As the name suggests, an adaptive filter can adapt to the varying nature of an image by adjusting the coefficients of the filtering window according to the local variance in the intensity values of the pixels. It filters an image non-uniformly, providing greater smoothing in largely uniform areas of the image and less smoothing where it encounters edges and step changes. These continual changes in the coefficient values of the adaptive filter pose a problem for an RTR implementation, because the benefits of RTR emerge only when there is considerable computing time between reconfigurations. This thesis provides a solution to this problem and reduces the running time of the algorithm through aggressive use of RTR. This work details the RTR implementation of an adaptive filter, along with an estimate of the running time and hardware resource requirements when synthesized on the Virtex™-E FPGA. As a specific case, we use a 3 × 3 filtering window and a 256 × 256 grayscale image, achieving speedups of 31 and 84 over pure software implementations running on Pentium III and Sun Ultra systems, respectively.
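
The behaviour described above (strong smoothing in uniform areas, weak smoothing at edges) can be sketched in software with a local-statistics filter. The example below uses a Lee-style rule, which is a stand-in chosen for illustration and not necessarily the algorithm used in the thesis. The `noise_var` parameter and the function name are assumptions. For each 3 × 3 window the output is `mean + gain * (pixel - mean)`, which is equivalent to a linear window whose centre coefficient is `gain + (1 - gain) / 9` and whose other coefficients are `(1 - gain) / 9`.

```python
def adaptive_smooth_3x3(img, noise_var):
    """Variance-driven smoothing over a 3x3 window (Lee-style sketch).

    img is a list of rows of numbers; border pixels are left unchanged.
    noise_var is an assumed estimate of the noise variance.
    """
    h, w = len(img), len(img[0])
    out = [[float(v) for v in row] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = [img[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            mean = sum(win) / 9.0
            var = sum((v - mean) ** 2 for v in win) / 9.0
            # gain near 0: flat area, output close to the local mean.
            # gain near 1: edge, output close to the original pixel.
            gain = max(var - noise_var, 0.0) / var if var > 0 else 0.0
            out[y][x] = mean + gain * (img[y][x] - mean)
    return out


# Nearly flat patch: the centre pixel is pulled to the local mean.
flat_patch = [[10, 10, 10], [10, 12, 10], [10, 10, 10]]
smoothed_flat = adaptive_smooth_3x3(flat_patch, noise_var=100.0)

# Step edge: the centre pixel is left almost untouched.
edge_patch = [[0, 0, 200], [0, 0, 200], [0, 0, 200]]
smoothed_edge = adaptive_smooth_3x3(edge_patch, noise_var=1.0)
```

Because the gain changes from window to window, the effective coefficients change continually, which is the property that makes a naive RTR mapping difficult.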

    Pulse shaping of OFDM waveforms for the 5G air interface

    Advisor: Luís Geraldo Pedroso Meloni. Dissertation (master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Abstract: Orthogonal frequency division multiplexing (OFDM) waveforms have been used successfully in the 3GPP Long Term Evolution (LTE) air interface to overcome channel selectivity and to provide good spectral efficiency and high data rates. The forthcoming 5G communication system aims to support more services than its predecessor, such as enhanced mobile broadband, machine-type communications and low-latency communications, and considers many other application scenarios, such as fragmented spectrum use. This diversity of services with different requirements cannot be supported by conventional OFDM, since OFDM configures the entire bandwidth with parameters that serve one particular service. 
Also, substantial intercarrier interference (ICI) can occur when conventional OFDM is used with asynchronous multiuser multiplexing, owing to the high out-of-band (OOB) emissions of the subcarriers and the violation of the signal orthogonality constraint. Therefore, to meet the requirements of future 5G wireless applications, an innovative air interface with new capabilities becomes necessary; in particular, a new waveform that is more spectrally agile than OFDM, capable of supporting multiple configurations, able to suppress inter-user interference effectively, and straightforward to integrate with the upper layers. This work focuses on two pulse shaping techniques to reduce OOB emission and improve the in-band and OOB performance of OFDM-based waveforms. Pulse shaping can enable the use of multiple parameterizations within the waveform and abandon the strict paradigms of orthogonality and synchronism, with relatively low performance degradation caused by intersymbol interference (ISI) and ICI. The first part addresses a pulse shaping method based on per-subcarrier filtering to reduce both OOB emission in the transmitter and adjacent channel interference (ACI) in the receiver. It can be implemented using window functions, and some window formats are presented in this part. The first uses the existing cyclic prefix (CP) of OFDM symbols to smooth abrupt transitions of the signal, thus reducing the large sinc spectral sidelobes caused by the rectangular filters. This guarantees backwards compatibility in systems using conventional cyclic prefixed OFDM (CP-OFDM). The second window format extends the CP length to retain the waveform's ability to combat channel delay spread. The effects of ISI and ICI on performance are studied in terms of the signal to interference ratio (SIR) and bit error rate (BER) using LTE waveforms in a multi-user fragmented spectrum scenario. 
The second part of this work addresses the design and analysis of filters for flexible spectral containment in subband-based filtering transceivers. This filter, called here semi-equiripple, exhibits better stopband attenuation for reducing inter-subband interference than equiripple and windowed truncated sinc filters, and also has good impulse response characteristics for reducing ISI. The design is based on the Parks-McClellan algorithm to obtain different stopband decay rates and meet arbitrary spectrum emission mask (SEM) specifications with low in-band distortion. It can therefore be useful for achieving low OOB emission and configuring subbands with independent parameters, since the asynchronous interference is contained by the filters. Three ISI distortions in the filter are studied: symbol spreading related to the filter causality, symbol echoes due to in-band ripples, and ISI amplification due to outlier samples in the tails of its impulse response. The performance of the filter is assessed in terms of the power spectral density (PSD) and compliance with tight SEMs, modulation error rate (MER), and operation in a multi-service asynchronous scheme using a single waveform. The SIR and the effect of filtering on the modulation accuracy are evaluated using OFDM ISDB-T and LTE waveforms. Flexible hardware structures are also proposed for actual implementations. The results show that these pulse shaping methods enable the waveform to exploit the available spectrum fragments and support multiple services without a significant performance penalty, which can allow a more flexible air interface. Master's degree (Mestrado), Telecommunications and Telematics; Master in Electrical Engineering. CAPE
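
The first window format described above, which spends part of the existing CP on smoothing symbol transitions, can be sketched as transmitter-side windowing with overlap-add. The example below is a minimal pure-Python illustration, not the dissertation's implementation. The raised-cosine ramp shape, the function names, and the parameters `cp_len` and `ramp_len` are assumptions. A naive inverse DFT is used to keep the block self-contained.

```python
import cmath
import math


def idft(freq):
    """Naive inverse DFT: frequency-domain subcarriers to time samples."""
    n = len(freq)
    return [sum(freq[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def raised_cosine_ramp(length):
    """Rising edge values in (0, 1); ramp[i] + ramp[length - 1 - i] == 1."""
    return [0.5 * (1.0 - math.cos(math.pi * (i + 0.5) / length))
            for i in range(length)]


def windowed_cp_ofdm_symbol(freq, cp_len, ramp_len):
    """Build [CP | body | cyclic suffix] and taper both ends.

    The rising ramp is applied to the first ramp_len samples of the CP,
    so that part of the CP no longer protects against delay spread.
    """
    if ramp_len > cp_len:
        raise ValueError("ramp_len must not exceed cp_len")
    body = idft(freq)
    n = len(body)
    extended = body[n - cp_len:] + body + body[:ramp_len]
    rise = raised_cosine_ramp(ramp_len)
    out = list(extended)
    for i in range(ramp_len):
        out[i] *= rise[i]        # rising edge at the start of the CP
        out[-1 - i] *= rise[i]   # falling edge over the cyclic suffix
    return out


def overlap_add(symbols, ramp_len):
    """Concatenate symbols so each suffix overlaps the next symbol's ramp."""
    stream = []
    for sym in symbols:
        if stream:
            base = len(stream) - ramp_len
            for i in range(ramp_len):
                stream[base + i] += sym[i]
            stream.extend(sym[ramp_len:])
        else:
            stream.extend(sym)
    return stream


subcarriers = [0, 1, 0, 0, 1j, 0, 0, 0]  # hypothetical 8-point symbol
symbol = windowed_cp_ofdm_symbol(subcarriers, cp_len=4, ramp_len=2)
stream = overlap_add([symbol, symbol], ramp_len=2)
```

Because the suffix of one symbol overlaps the tapered start of the next, the symbol period stays at `cp_len + n` samples, which is what keeps this format compatible with plain CP-OFDM receivers in this sketch.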

    Domain-Specific Computing Architectures and Paradigms

    We live in an exciting era where artificial intelligence (AI) is fundamentally shifting the dynamics of industries and businesses around the world. AI algorithms such as deep learning (DL) have drastically advanced state-of-the-art cognition and learning capabilities. However, the power of modern AI algorithms can only be realized if the underlying domain-specific computing hardware delivers orders of magnitude more performance and energy efficiency. This work focuses on this goal and explores three parts of the domain-specific computing acceleration problem, encapsulating specialized hardware and software architectures and paradigms that support the ever-growing processing demand of modern AI applications from the edge to the cloud. The first part of this work investigates the optimization of a sparse spatio-temporal (ST) cognitive system-on-a-chip (SoC). This design extracts ST features from videos and leverages sparse inference and kernel compression to efficiently perform action classification and motion tracking. The second part of this work explores the significance of dataflows and reduction mechanisms for sparse deep neural network (DNN) acceleration. This design features a dynamic, look-ahead index matching unit in hardware to efficiently discover fine-grained parallelism, achieving high energy efficiency and low control complexity for a wide variety of DNN layers. Lastly, this work expands the scope to real-time machine learning (RTML) acceleration. A new high-level architecture modeling framework is proposed. Specifically, this framework consists of a set of high-performance RTML-specific architecture design templates, and a Python-based high-level modeling and compiler tool chain for efficient cross-stack architecture design and exploration. PhD, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162870/1/lchingen_1.pd

    Dynamic Scheduling, Allocation, and Compaction Scheme for Real-Time Tasks on FPGAs

    Run-time reconfiguration (RTR) is a method of computing on reconfigurable logic, typically FPGAs, in which hardware configurations change from phase to phase of a computation at run-time. Recent research has expanded from a focus on a single application at a time to a view of the reconfigurable logic as a resource shared among multiple applications or users. In real-time system design, task deadlines play an important role. Real-time multi-tasking systems need not only to support sharing of the resources in space, but also to guarantee execution of the tasks. At the operating system level, the sharing of logic gates, wires, and I/O pins among multiple tasks needs to be managed. From the high-level standpoint, access to the resources needs to be scheduled according to task deadlines. This thesis describes a task allocator for scheduling, placing, and compacting tasks on a shared FPGA under real-time constraints. Our consideration of task deadlines is novel in the setting of handling multiple simultaneous tasks in RTR. Software simulations have been conducted to evaluate the performance of the proposed scheme. The results indicate a significant improvement, in the form of a reduced number of rejected tasks.
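
The interplay of placement, compaction and deadlines described above can be illustrated with a toy allocator. The sketch below makes several simplifying assumptions that are not taken from the thesis: the FPGA is modelled as a one-dimensional row of columns, placement is first-fit, compaction slides running tasks to the left at zero cost, and a task that cannot start immediately and still meet its deadline is rejected rather than queued. The class and method names are hypothetical.

```python
class ColumnAllocator:
    """Toy deadline-aware allocator over a 1-D row of FPGA columns."""

    def __init__(self, total_cols):
        self.total_cols = total_cols
        self.running = []  # each task: dict with name, start, width, finish

    def _release(self, now):
        # Drop tasks that have finished by the current time.
        self.running = [t for t in self.running if t["finish"] > now]

    def _first_fit(self, width):
        # Scan gaps between placed tasks from left to right.
        col = 0
        for t in sorted(self.running, key=lambda t: t["start"]):
            if t["start"] - col >= width:
                return col
            col = t["start"] + t["width"]
        return col if self.total_cols - col >= width else None

    def _compact(self):
        # Slide running tasks left so all free columns become contiguous.
        col = 0
        for t in sorted(self.running, key=lambda t: t["start"]):
            t["start"] = col
            col += t["width"]

    def submit(self, name, width, exec_time, deadline, now):
        """Return True if the task is accepted, False if it is rejected."""
        self._release(now)
        if now + exec_time > deadline:
            return False  # cannot finish in time even if started now
        start = self._first_fit(width)
        if start is None:
            free = self.total_cols - sum(t["width"] for t in self.running)
            if free < width:
                return False  # not enough area even after compaction
            self._compact()
            start = self._first_fit(width)
        self.running.append({"name": name, "start": start, "width": width,
                             "finish": now + exec_time})
        return True


fpga = ColumnAllocator(total_cols=10)
results = [
    fpga.submit("A", width=3, exec_time=2, deadline=10, now=0),
    fpga.submit("B", width=4, exec_time=10, deadline=20, now=0),
    fpga.submit("C", width=3, exec_time=2, deadline=10, now=0),
    # At time 2, A and C are done; free space is split into 3 + 3 columns.
    fpga.submit("D", width=5, exec_time=3, deadline=9, now=2),
    fpga.submit("E", width=1, exec_time=5, deadline=6, now=2),
    fpga.submit("F", width=2, exec_time=1, deadline=10, now=2),
]
placement = {t["name"]: t["start"] for t in fpga.running}
```

In this run, task D is accepted only because compaction merges two 3-column gaps into one 6-column gap; without it D would be rejected, which mirrors the abstract's point that compaction reduces the number of rejected tasks. Task E is rejected on its deadline and task F on area.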