43 research outputs found

    Dynamically reconfigurable bio-inspired hardware

    Get PDF
    During the last several years, reconfigurable computing devices have experienced an impressive development in their resource availability, speed, and configurability. Currently, commercial FPGAs offer the possibility of self-reconfiguring by partially modifying their configuration bitstream, providing high architectural flexibility, while guaranteeing high performance. These configurability features have received special interest from computer architects: one can find several reconfigurable coprocessor architectures for cryptographic algorithms, image processing, automotive applications, and different general purpose functions. On the other hand we have bio-inspired hardware, a large research field taking inspiration from living beings in order to design hardware systems, which includes diverse topics: evolvable hardware, neural hardware, cellular automata, and fuzzy hardware, among others. Living beings are well known for their high adaptability to environmental changes, featuring very flexible adaptations at several levels. Bio-inspired hardware systems require such flexibility to be provided by the hardware platform on which the system is implemented. In general, bio-inspired hardware has been implemented on both custom and commercial hardware platforms. These custom platforms are specifically designed for supporting bio-inspired hardware systems, typically featuring special cellular architectures and enhanced reconfigurability capabilities; an example is their partial and dynamic reconfigurability. These aspects are very well appreciated for providing the performance and the high architectural flexibility required by bio-inspired systems. However, the availability and the very high costs of such custom devices make them only accessible to a very few research groups. Even though some commercial FPGAs provide enhanced reconfigurability features such as partial and dynamic reconfiguration, their utilization is still in its early stages and they are not well supported by FPGA vendors, thus making their use difficult to include in existing bio-inspired systems. In this thesis, I present a set of architectures, techniques, and methodologies for benefiting from the configurability advantages of current commercial FPGAs in the design of bio-inspired hardware systems. Among the presented architectures there are neural networks, spiking neuron models, fuzzy systems, cellular automata and random boolean networks. For these architectures, I propose several adaptation techniques for parametric and topological adaptation, such as hebbian learning, evolutionary and co-evolutionary algorithms, and particle swarm optimization. Finally, as case study I consider the implementation of bio-inspired hardware systems in two platforms: YaMoR (Yet another Modular Robot) and ROPES (Reconfigurable Object for Pervasive Systems); the development of both platforms having been co-supervised in the framework of this thesis

    802.16e System Profile for NASA Extra-Vehicular Activities

    Get PDF
    This report identifies an 802.16e system profile that is applicable to a lunar surface wireless network, and specifically for meeting extra-vehicular activity (EVA) data flow requirements. EVA suit communication needs are addressed. Design-driving operational scenarios are considered. These scenarios are then used to identify a configuration of the 802.16e system (system profile) that meets EVA requirements, but also aim to make the radio realizable within EVA constraints. Limitations of this system configuration are highlighted. An overview and development status is presented by Toyon Research Corporation concerning the development of an 802.16e compatible modem under NASA s Small Business Innovative Research (SBIR) Program. This modem is based on the recommended system profile developed as part of this report. Last, a path forward is outlined that presents an evolvable solution for the EVA radio system and lunar surface radio networks. This solution is based on a custom link layer, and 802.16e compliant physical layer compliant to the identified system profile, and a later progression to a fully interoperable 802.16e system

    High-performance hardware accelerators for image processing in space applications

    Get PDF
    Mars is a hard place to reach. While there have been many notable success stories in getting probes to the Red Planet, the historical record is full of bad news. The success rate for actually landing on the Martian surface is even worse, roughly 30%. This low success rate must be mainly credited to the Mars environment characteristics. In the Mars atmosphere strong winds frequently breath. This phenomena usually modifies the lander descending trajectory diverging it from the target one. Moreover, the Mars surface is not the best place where performing a safe land. It is pitched by many and close craters and huge stones, and characterized by huge mountains and hills (e.g., Olympus Mons is 648 km in diameter and 27 km tall). For these reasons a mission failure due to a landing in huge craters, on big stones or on part of the surface characterized by a high slope is highly probable. In the last years, all space agencies have increased their research efforts in order to enhance the success rate of Mars missions. In particular, the two hottest research topics are: the active debris removal and the guided landing on Mars. The former aims at finding new methods to remove space debris exploiting unmanned spacecrafts. These must be able to autonomously: detect a debris, analyses it, in order to extract its characteristics in terms of weight, speed and dimension, and, eventually, rendezvous with it. In order to perform these tasks, the spacecraft must have high vision capabilities. In other words, it must be able to take pictures and process them with very complex image processing algorithms in order to detect, track and analyse the debris. The latter aims at increasing the landing point precision (i.e., landing ellipse) on Mars. Future space-missions will increasingly adopt Video Based Navigation systems to assist the entry, descent and landing (EDL) phase of space modules (e.g., spacecrafts), enhancing the precision of automatic EDL navigation systems. For instance, recent space exploration missions, e.g., Spirity, Oppurtunity, and Curiosity, made use of an EDL procedure aiming at following a fixed and precomputed descending trajectory to reach a precise landing point. This approach guarantees a maximum landing point precision of 20 km. By comparing this data with the Mars environment characteristics, it is possible to understand how the mission failure probability still remains really high. A very challenging problem is to design an autonomous-guided EDL system able to even more reduce the landing ellipse, guaranteeing to avoid the landing in dangerous area of Mars surface (e.g., huge craters or big stones) that could lead to the mission failure. The autonomous behaviour of the system is mandatory since a manual driven approach is not feasible due to the distance between Earth and Mars. Since this distance varies from 56 to 100 million of km approximately due to the orbit eccentricity, even if a signal transmission at the light speed could be possible, in the best case the transmission time would be around 31 minutes, exceeding so the overall duration of the EDL phase. In both applications, algorithms must guarantee self-adaptability to the environmental conditions. Since the Mars (and in general the space) harsh conditions are difficult to be predicted at design time, these algorithms must be able to automatically tune the internal parameters depending on the current conditions. Moreover, real-time performances are another key factor. Since a software implementation of these computational intensive tasks cannot reach the required performances, these algorithms must be accelerated via hardware. For this reasons, this thesis presents my research work done on advanced image processing algorithms for space applications and the associated hardware accelerators. My research activity has been focused on both the algorithm and their hardware implementations. Concerning the first aspect, I mainly focused my research effort to integrate self-adaptability features in the existing algorithms. While concerning the second, I studied and validated a methodology to efficiently develop, verify and validate hardware components aimed at accelerating video-based applications. This approach allowed me to develop and test high performance hardware accelerators that strongly overcome the performances of the actual state-of-the-art implementations. The thesis is organized in four main chapters. Chapter 2 starts with a brief introduction about the story of digital image processing. The main content of this chapter is the description of space missions in which digital image processing has a key role. A major effort has been spent on the missions in which my research activity has a substantial impact. In particular, for these missions, this chapter deeply analizes and evaluates the state-of-the-art approaches and algorithms. Chapter 3 analyzes and compares the two technologies used to implement high performances hardware accelerators, i.e., Application Specific Integrated Circuits (ASICs) and Field Programmable Gate Arrays (FPGAs). Thanks to this information the reader may understand the main reasons behind the decision of space agencies to exploit FPGAs instead of ASICs for high-performance hardware accelerators in space missions, even if FPGAs are more sensible to Single Event Upsets (i.e., transient error induced on hardware component by alpha particles and solar radiation in space). Moreover, this chapter deeply describes the three available space-grade FPGA technologies (i.e., One-time Programmable, Flash-based, and SRAM-based), and the main fault-mitigation techniques against SEUs that are mandatory for employing space-grade FPGAs in actual missions. Chapter 4 describes one of the main contribution of my research work: a library of high-performance hardware accelerators for image processing in space applications. The basic idea behind this library is to offer to designers a set of validated hardware components able to strongly speed up the basic image processing operations commonly used in an image processing chain. In other words, these components can be directly used as elementary building blocks to easily create a complex image processing system, without wasting time in the debug and validation phase. This library groups the proposed hardware accelerators in IP-core families. The components contained in a same family share the same provided functionality and input/output interface. This harmonization in the I/O interface enables to substitute, inside a complex image processing system, components of the same family without requiring modifications to the system communication infrastructure. In addition to the analysis of the internal architecture of the proposed components, another important aspect of this chapter is the methodology used to develop, verify and validate the proposed high performance image processing hardware accelerators. This methodology involves the usage of different programming and hardware description languages in order to support the designer from the algorithm modelling up to the hardware implementation and validation. Chapter 5 presents the proposed complex image processing systems. In particular, it exploits a set of actual case studies, associated with the most recent space agency needs, to show how the hardware accelerator components can be assembled to build a complex image processing system. In addition to the hardware accelerators contained in the library, the described complex system embeds innovative ad-hoc hardware components and software routines able to provide high performance and self-adaptable image processing functionalities. To prove the benefits of the proposed methodology, each case study is concluded with a comparison with the current state-of-the-art implementations, highlighting the benefits in terms of performances and self-adaptability to the environmental conditions

    Técnicas de pré-codificação para sistemas multicelulares coordenados

    Get PDF
    Doutoramento em TelecomunicaçõesCoordenação Multicélula é um tópico de investigação em rápido crescimento e uma solução promissora para controlar a interferência entre células em sistemas celulares, melhorando a equidade do sistema e aumentando a sua capacidade. Esta tecnologia já está em estudo no LTEAdvanced sob o conceito de coordenação multiponto (COMP). Existem várias abordagens sobre coordenação multicélula, dependendo da quantidade e do tipo de informação partilhada pelas estações base, através da rede de suporte (backhaul network), e do local onde essa informação é processada, i.e., numa unidade de processamento central ou de uma forma distribuída em cada estação base. Nesta tese, são propostas técnicas de pré-codificação e alocação de potência considerando várias estratégias: centralizada, todo o processamento é feito na unidade de processamento central; semidistribuída, neste caso apenas parte do processamento é executado na unidade de processamento central, nomeadamente a potência alocada a cada utilizador servido por cada estação base; e distribuída em que o processamento é feito localmente em cada estação base. Os esquemas propostos são projectados em duas fases: primeiro são propostas soluções de pré-codificação para mitigar ou eliminar a interferência entre células, de seguida o sistema é melhorado através do desenvolvimento de vários esquemas de alocação de potência. São propostas três esquemas de alocação de potência centralizada condicionada a cada estação base e com diferentes relações entre desempenho e complexidade. São também derivados esquemas de alocação distribuídos, assumindo que um sistema multicelular pode ser visto como a sobreposição de vários sistemas com uma única célula. Com base neste conceito foi definido uma taxa de erro média virtual para cada um desses sistemas de célula única que compõem o sistema multicelular, permitindo assim projectar esquemas de alocação de potência completamente distribuídos. Todos os esquemas propostos foram avaliados em cenários realistas, bastante próximos dos considerados no LTE. Os resultados mostram que os esquemas propostos são eficientes a remover a interferência entre células e que o desempenho das técnicas de alocação de potência propostas é claramente superior ao caso de não alocação de potência. O desempenho dos sistemas completamente distribuídos é inferior aos baseados num processamento centralizado, mas em contrapartida podem ser usados em sistemas em que a rede de suporte não permita a troca de grandes quantidades de informação.Multicell coordination is a promising solution for cellular wireless systems to mitigate inter-cell interference, improving system fairness and increasing capacity and thus is already under study in LTE-A under the coordinated multipoint (CoMP) concept. There are several coordinated transmission approaches depending on the amount of information shared by the transmitters through the backhaul network and where the processing takes place i.e. in a central processing unit or in a distributed way on each base station. In this thesis, we propose joint precoding and power allocation techniques considering different strategies: Full-centralized, where all the processing takes place at the central unit; Semi-distributed, in this case only some process related with power allocation is done at the central unit; and Fulldistributed, where all the processing is done locally at each base station. The methods are designed in two phases: first the inter-cell interference is removed by applying a set of centralized or distributed precoding vectors; then the system is further optimized by centralized or distributed power allocation schemes. Three centralized power allocation algorithms with per-BS power constraint and different complexity tradeoffs are proposed. Also distributed power allocation schemes are proposed by considering the multicell system as superposition of single cell systems, where we define the average virtual bit error rate (BER) of interference-free single cell system, allowing us to compute the power allocation coefficients in a distributed manner at each BS. All proposed schemes are evaluated in realistic scenarios considering LTE specifications. The numerical evaluations show that the proposed schemes are efficient in removing inter-cell interference and improve system performance comparing to equal power allocation. Furthermore, fulldistributed schemes can be used when the amounts of information to be exchanged over the backhaul is restricted, although system performance is slightly degraded from semi-distributed and full-centralized schemes, but the complexity is considerably lower. Besides that for high degrees of freedom distributed schemes show similar behaviour to centralized ones

    FPGA structures for high speed and low overhead dynamic circuit specialization

    Get PDF
    A Field Programmable Gate Array (FPGA) is a programmable digital electronic chip. The FPGA does not come with a predefined function from the manufacturer; instead, the developer has to define its function through implementing a digital circuit on the FPGA resources. The functionality of the FPGA can be reprogrammed as desired and hence the name “field programmable”. FPGAs are useful in small volume digital electronic products as the design of a digital custom chip is expensive. Changing the FPGA (also called configuring it) is done by changing the configuration data (in the form of bitstreams) that defines the FPGA functionality. These bitstreams are stored in a memory of the FPGA called configuration memory. The SRAM cells of LookUp Tables (LUTs), Block Random Access Memories (BRAMs) and DSP blocks together form the configuration memory of an FPGA. The configuration data can be modified according to the user’s needs to implement the user-defined hardware. The simplest way to program the configuration memory is to download the bitstreams using a JTAG interface. However, modern techniques such as Partial Reconfiguration (PR) enable us to configure a part in the configuration memory with partial bitstreams during run-time. The reconfiguration is achieved by swapping in partial bitstreams into the configuration memory via a configuration interface called Internal Configuration Access Port (ICAP). The ICAP is a hardware primitive (macro) present in the FPGA used to access the configuration memory internally by an embedded processor. The reconfiguration technique adds flexibility to use specialized ci rcuits that are more compact and more efficient t han t heir b ulky c ounterparts. An example of such an implementation is the use of specialized multipliers instead of big generic multipliers in an FIR implementation with constant coefficients. To specialize these circuits and reconfigure during the run-time, researchers at the HES group proposed the novel technique called parameterized reconfiguration that can be used to efficiently and automatically implement Dynamic Circuit Specialization (DCS) that is built on top of the Partial Reconfiguration method. It uses the run-time reconfiguration technique that is tailored to implement a parameterized design. An application is said to be parameterized if some of its input values change much less frequently than the rest. These inputs are called parameters. Instead of implementing these parameters as regular inputs, in DCS these inputs are implemented as constants, and the application is optimized for the constants. For every change in parameter values, the design is re-optimized (specialized) during run-time and implemented by reconfiguring the optimized design for a new set of parameters. In DCS, the bitstreams of the parameterized design are expressed as Boolean functions of the parameters. For every infrequent change in parameters, a specialized FPGA configuration is generated by evaluating the corresponding Boolean functions, and the FPGA is reconfigured with the specialized configuration. A detailed study of overheads of DCS and providing suitable solutions with appropriate custom FPGA structures is the primary goal of the dissertation. I also suggest different improvements to the FPGA configuration memory architecture. After offering the custom FPGA structures, I investigated the role of DCS on FPGA overlays and the use of custom FPGA structures that help to reduce the overheads of DCS on FPGA overlays. By doing so, I hope I can convince the developer to use DCS (which now comes with minimal costs) in real-world applications. I start the investigations of overheads of DCS by implementing an adaptive FIR filter (using the DCS technique) on three different Xilinx FPGA platforms: Virtex-II Pro, Virtex-5, and Zynq-SoC. The study of how DCS behaves and what is its overhead in the evolution of the three FPGA platforms is the non-trivial basis to discover the costs of DCS. After that, I propose custom FPGA structures (reconfiguration controllers and reconfiguration drivers) to reduce the main overhead (reconfiguration time) of DCS. These structures not only reduce the reconfiguration time but also help curbing the power hungry part of the DCS system. After these chapters, I study the role of DCS on FPGA overlays. I investigate the effect of the proposed FPGA structures on Virtual-Coarse-Grained Reconfigurable Arrays (VCGRAs). I classify the VCGRA implementations into three types: the conventional VCGRA, partially parameterized VCGRA and fully parameterized VCGRA depending upon the level of parameterization. I have designed two variants of VCGRA grids for HPC image processing applications, namely, the MAC grid and Pixie. Finally, I try to tackle the reconfiguration time overhead at the hardware level of the FPGA by customizing the FPGA configuration memory architecture. In this part of my research, I propose to use a parallel memory structure to improve the reconfiguration time of DCS drastically. However, this improvement comes with a significant overhead of hardware resources which will need to be solved in future research on commercial FPGA configuration memory architectures

    Survey of FPGA applications in the period 2000 – 2015 (Technical Report)

    Get PDF
    Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report).; 2017.Since their introduction, FPGAs can be seen in more and more different fields of applications. The key advantage is the combination of software-like flexibility with the performance otherwise common to hardware. Nevertheless, every application field introduces special requirements to the used computational architecture. This paper provides an overview of the different topics FPGAs have been used for in the last 15 years of research and why they have been chosen over other processing units like e.g. CPUs

    Intrinsic Hardware Evolution on the Transistor Level

    Get PDF
    This thesis presents a novel approach to the automated synthesis of analog circuits. Evolutionary algorithms are used in conjunction with a fitness evaluation on a dedicated ASIC that serves as the analog substrate for the newly bred candidate solutions. The advantage of evaluating the candidate circuits directly in hardware is twofold. First, it may speed up the evolutionary algorithms, because hardware tests can usually be performed faster than simulations. Second, the evolved circuits are guaranteed to work on a real piece of silicon. The proposed approach is realized as a hardware evolution system consisting of an IBM compatible general purpose computer that hosts the evolutionary algorithm, an FPGA-based mixed signal test board, and the analog substrate. The latter one is designed as a Field Programmable Transistor Array (FPTA) whose programmable transistor cells can be almost freely connected. The transistor cells can be configured to adopt one out of 75 different channel geometries. The chip was produced in a 0.6µm CMOS process and provides ample means for the input and output of analog signals. The configuration is stored in SRAM cells embedded in the programmable transistor cells. The hardware evolution system is used for numerous evolution experiments targeted at a wide variety of different circuit functionalities. These comprise logic gates, Gaussian function circuits, D/A converters, low- and highpass filters, tone discriminators, and comparators. The experimental results are thoroughly analyzed and discussed with respect to related work
    corecore