
    A wave pipeline-based WCDMA multipath searcher for a high speed operation

    The multiplexing technique of Wideband Code Division Multiple Access (WCDMA) is widely applied in third-generation (3G) cellular systems. WCDMA uses scrambling codes to differentiate the mobile terminals. In a channel, multipaths may occur when the transmitted signal is reflected from objects in the receiver's environment, so that multiple copies of the signal arrive at the antenna at different times. A wideband signal may thus suffer frequency-selective fading due to multipath propagation. A Rake receiver is often used to combine the energies received on different paths, and a multipath searcher is needed to identify the multipath components and their associated delays. Correlating shifted versions of the scrambling code with the incoming signal produces energy peaks at the multipath locations, where the locally generated scrambling sequence is aligned with the scrambling sequence of the incoming signal. Path acquisition in such a process requires millions of Multiply-Accumulate (MAC) operations per second. The performance of the multipath searcher is mainly determined by its resolution and acquisition time, which are often limited by the operating speed of the hardware resources. This thesis presents the design of a multipath searcher with a high resolution and a short acquisition time. The design has two aspects. The first is the searching algorithm: it is based on a double-dwell algorithm, and a verification stage is introduced to lower the rate of false alarms. The second is the searcher circuit. This circuit is expected to operate at the chip rate of 3.84 MHz, and the search period is chosen to be equal to the time interval of 5 slots, which requires a high operating speed of the computation units employed in the circuit. Moreover, in order to reduce circuit complexity, only one Complex Multiplier-Accumulator (CMAC), instead of the several used in many existing searcher circuits, is employed to perform all the computation tasks without extending the search period, which makes the computation time in the circuit even more critical. To meet this high-speed requirement, the CMAC cell is designed with the wave-pipeline technique, which allows signals to propagate through the circuit stages without clock constraints. To make good use of this technique, the circuit blocks are given equalized delays by means of pass-transistor logic cells, and by keeping this delay short, the total computation time of the CMAC can be kept within the time limit required by the search. A complete circuit of the CMAC has been developed in two versions, using Normal Process Complementary Pass Logic (NPCPL) and Complementary Pass-Logic Transmission-Gates (CPL-TG), respectively. The structures of the arithmetic units have been chosen carefully so that the fan-in/fan-out constraints of the NPCPL and CPL-TG logic styles are taken into consideration. Simulation results with 0.18 µm models have shown that this wave-pipelined CMAC can process four 8-bit inputs at a rate of 830 Mb/s. In order to evaluate the effectiveness of the searching algorithm, a Matlab simulation of the searcher circuit has been conducted. It has been observed that the proposed multipath searcher achieves low probabilities of misdetection and false alarm for the test cases recommended by the 3rd Generation Partnership Project (3GPP) standard. A test chip of the CMAC circuit has been fabricated in a 0.18 µm CMOS technology process. The circuit is currently under test.
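
    As a rough illustration of the search principle described in this abstract, the sketch below correlates shifted copies of a scrambling code against received samples and applies a simple two-stage (double-dwell) energy check. It is only a software model; the function name, correlation lengths and thresholds are hypothetical and not taken from the thesis.

        # Minimal software sketch of a correlation-based multipath search with a
        # double-dwell check. All parameters are illustrative assumptions.
        # Assumes rx holds at least max_delay + 2 * n_coh complex samples.
        import numpy as np

        def search_multipaths(rx, code, max_delay, n_coh=256, thr1=4.0, thr2=6.0):
            """Return delays whose correlation energy passes both dwells."""
            paths = []
            for d in range(max_delay):
                # First dwell: short coherent correlation at offset d.
                e1 = np.abs(np.vdot(code[:n_coh], rx[d:d + n_coh])) ** 2
                if e1 > thr1 * n_coh:
                    # Second dwell (verification): longer correlation to reject false alarms.
                    e2 = np.abs(np.vdot(code[:2 * n_coh], rx[d:d + 2 * n_coh])) ** 2
                    if e2 > thr2 * 2 * n_coh:
                        paths.append((d, e2))
            return paths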

    On the Exploration of FPGAs and High-Level Synthesis Capabilities on Multi-Gigabit-per-Second Networks

    Unpublished doctoral thesis read at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones. Defense date: 24-01-2020. Traffic on computer networks has grown exponentially in recent years. Both links and communication equipment have had to adapt in order to provide the minimum quality of service required by current needs. However, in recent years a few factors have prevented commercial off-the-shelf hardware from keeping pace with this growth rate; consequently, some software tools are struggling to fulfill their tasks, especially at speeds higher than 10 Gbit/s. For this reason, Field Programmable Gate Arrays (FPGAs) have arisen as an alternative to address the most demanding tasks without the need to design an application-specific integrated circuit, thanks in part to their flexibility and in-field programmability. Needless to say, developing for FPGAs is well known to be complex. Therefore, in this thesis we tackle the use of FPGAs and High-Level Synthesis (HLS) languages in the context of computer networks. We focus on the use of FPGAs both in computer network monitoring applications and in reliable data transmission at very high speeds. We also intend to shed light on the use of high-level synthesis languages and boost FPGA applicability in the context of computer networks, so as to reduce development time and design complexity. The first part of the thesis is devoted to computer network monitoring. We take advantage of FPGA determinism to implement active monitoring probes, which consist of sending a train of packets that is later used to obtain network parameters. In this case, determinism is key to reducing the uncertainty of the measurements. The results of our experiments show that the FPGA implementations are much more accurate and precise than their software counterparts. At the same time, the FPGA implementation is scalable in terms of network speed: 1, 10 and 100 Gbit/s. In the context of passive monitoring, we leverage the FPGA architecture to implement algorithms able to thin encrypted traffic and to remove duplicate packets. These two algorithms are straightforward in principle but very useful in helping traditional network analysis tools cope with their tasks at higher network speeds: processing encrypted traffic brings little benefit, while processing duplicate traffic negatively impacts the performance of the software tools. The second part of the thesis is devoted to the TCP/IP stack. We explore the current limitations of reliable data transmission using standard software at very high speeds. Nowadays, the network is becoming an important bottleneck to fulfilling current needs, in particular in data centers. What is more, the deployment of 100 Gbit/s network links has started in recent years. Consequently, there has been increased scrutiny of how networking functionality is deployed, and a wide range of approaches are currently being explored to increase the efficiency of networks and tailor their functionality to the actual needs of the application at hand. FPGAs arise as the perfect alternative to deal with this problem. For this reason, in this thesis we develop Limago, an FPGA-based open-source implementation of a TCP/IP stack operating at 100 Gbit/s for Xilinx FPGAs. Limago not only provides unprecedented throughput but also a latency at least fifteen times lower than that of software implementations. Limago is a key contribution to some of the hottest topics at the moment, such as network-attached FPGAs and in-network data processing.
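
    The duplicate-removal step mentioned above is easy to model in software. The sketch below drops packets whose payload hash was already seen within a recent window; the window size and hash choice are illustrative assumptions, and the actual FPGA pipeline in the thesis is not reproduced here.

        # Software model of sliding-window packet deduplication (illustrative only).
        import hashlib
        from collections import OrderedDict

        def deduplicate(packets, window=1024):
            """Yield packets whose hash has not appeared among the last `window` unique packets."""
            recent = OrderedDict()                   # hash -> None, insertion ordered
            for pkt in packets:                      # pkt: bytes of the packet payload
                h = hashlib.blake2b(pkt, digest_size=8).digest()
                if h in recent:
                    continue                         # duplicate within the window: drop it
                recent[h] = None
                if len(recent) > window:
                    recent.popitem(last=False)       # evict the oldest entry
                yield pkt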

    Survey of FPGA applications in the period 2000 – 2015 (Technical Report)

    Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report); 2017. Since their introduction, FPGAs have appeared in more and more fields of application. Their key advantage is the combination of software-like flexibility with performance otherwise typical of hardware. Nevertheless, every application field imposes specific requirements on the computational architecture used. This paper provides an overview of the different topics FPGAs have been used for in the last 15 years of research and why they have been chosen over other processing units such as CPUs.

    Error minimising gradients for improving cerebellar model articulation controller performance

    In motion control applications where the desired trajectory velocity exceeds an actuator's maximum velocity limitations, large position errors will occur between the desired and actual trajectory responses. In these situations standard control approaches cannot predict the output saturation of the actuator, and thus the associated error summation cannot be minimised. An adaptive feedforward control solution such as the Cerebellar Model Articulation Controller (CMAC) is able to provide an inherent level of prediction for these situations, moving the system output in the direction of the excessive desired velocity before actuator saturation occurs. However, the pre-empting level of a CMAC is not adaptive, and thus the optimal point in time to start moving the system output in the direction of the excessive desired velocity remains unsolved. While the CMAC can adaptively minimise an actuator's position error, minimising the summation of error over time created by the divergence of the desired and actual trajectory responses requires an additional adaptive level of control. This thesis presents an improved method of training CMACs to minimise the summation of error over time created when the desired trajectory velocity exceeds the actuator's maximum velocity limitations. This improved method, called the Error Minimising Gradient Controller (EMGC), is able to adaptively modify a CMAC's training signal so that the CMAC starts to move the output of the system in the direction of the excessive desired velocity with an optimised pre-empting level. The EMGC was originally created to minimise the loss of linguistic information conveyed through an actuated series of concatenated hand-sign gestures reproducing deafblind sign language. The EMGC concept, however, can be implemented on any system where the error summation associated with excessive desired velocities needs to be minimised, with the EMGC producing an improved output approximation over using a CMAC alone. In this thesis, the EMGC was tested and benchmarked against a combined feedforward/feedback controller using a CMAC and a PID controller. The EMGC was tested on an air-muscle actuator for a variety of situations comprising a position discontinuity in a continuous desired trajectory. Tested situations included various discontinuity magnitudes together with varying approach and departure gradient profiles. Testing demonstrated that the addition of an EMGC can reduce a situation's error summation magnitude if the base CMAC controller has not already provided a sufficiently early pre-empting output in the direction of the situation. The addition of an EMGC to a CMAC produces an improved approximation of reproduced motion trajectories, minimising position error not only for a single sampling instance but also over time for periodic signals.
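
    For readers unfamiliar with the CMAC itself, the sketch below shows a minimal one-dimensional CMAC function approximator with overlapping tilings and LMS-style weight updates. It is a generic textbook-style model, not the thesis controller or the EMGC; the tiling count, resolution, learning rate and target function are arbitrary assumptions.

        # Minimal 1-D CMAC (tile-coding) function approximator; parameters are illustrative.
        import numpy as np

        class CMAC:
            def __init__(self, n_tilings=8, n_tiles=64, x_min=0.0, x_max=1.0, lr=0.1):
                self.n_tilings, self.n_tiles, self.lr = n_tilings, n_tiles, lr
                self.x_min, self.x_max = x_min, x_max
                self.w = np.zeros((n_tilings, n_tiles + 1))   # one weight table per tiling

            def _active_cells(self, x):
                # Map x onto each tiling; tilings are offset by a fraction of a tile width.
                scaled = (x - self.x_min) / (self.x_max - self.x_min) * self.n_tiles
                return [(t, int(scaled + t / self.n_tilings)) for t in range(self.n_tilings)]

            def predict(self, x):
                return sum(self.w[t, i] for t, i in self._active_cells(x))

            def train(self, x, target):
                err = target - self.predict(x)
                for t, i in self._active_cells(x):
                    self.w[t, i] += self.lr * err / self.n_tilings

        # Example: learn a velocity-saturated response (hypothetical target function).
        cmac = CMAC()
        for _ in range(200):
            for x in np.linspace(0.0, 1.0, 50):
                cmac.train(x, min(2.0 * x, 1.0))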

    Using evolutionary artificial neural networks to design hierarchical animat nervous systems.

    The research presented in this thesis examines the area of control systems for robots or animats (animal-like robots). Existing systems have problems in that they require a great deal of manual design or are limited to performing jobs of a single type. For these reasons, a better solution is desired. The system studied here is an Artificial Nervous System (ANS) which is biologically inspired; it is arranged as a hierarchy of layers containing modules operating in parallel. The ANS model has been developed to be flexible, scalable, extensible and modular. The ANS can be implemented using any suitable technology, for many different environments. The implementation focused on the two lowest layers (the reflex and action layers) of the ANS, which are concerned with control and rhythmic movement. Both layers were realised as Artificial Neural Networks (ANNs) which were created using Evolutionary Algorithms (EAs). The task of the reflex layer was to control the position of an actuator (such as a linear actuator or a D.C. motor). The action layer performed the task of Central Pattern Generators (CPGs), which produce rhythmic patterns of activity. In particular, different biped and quadruped gait patterns were created. An original neural model was specifically developed for assisting in the creation of these time-based patterns. It is shown in the thesis that Artificial Reflexes and CPGs can be configured successfully using this technique. The Artificial Reflexes were better at generalising across different actuators, without changes, than traditional controllers. Gaits such as pace, trot, gallop and pronk were successfully created using the CPGs. Experiments were conducted to determine whether modularity in the networks had an impact. It was demonstrated that the degree of modularity in the network influences its evolvability, with more modular networks evolving more efficiently.
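
    As a rough illustration of what a Central Pattern Generator produces, the sketch below couples two phase oscillators so that they lock half a cycle apart and emit an alternating rhythmic drive. It is a generic toy model, not the thesis's original neural model evolved with EAs; the coupling scheme, frequency and gains are invented for illustration.

        # Toy two-unit CPG built from coupled phase oscillators (illustrative only).
        import numpy as np

        def cpg(n_steps=1000, dt=0.01, freq=1.0, coupling=2.0):
            """Integrate two oscillators held in anti-phase; return one rhythmic signal per leg."""
            phase = np.array([0.0, 0.5])              # current phases, in cycles
            offset = np.array([0.0, 0.5])             # desired anti-phase relationship
            out = np.zeros((n_steps, 2))
            for k in range(n_steps):
                for i in range(2):
                    j = 1 - i
                    # Advance at the base frequency and pull toward the desired phase offset.
                    diff = (phase[j] - phase[i]) - (offset[j] - offset[i])
                    phase[i] += dt * (freq + coupling * np.sin(2 * np.pi * diff))
                out[k] = np.sin(2 * np.pi * phase)    # rhythmic drive for each leg
            return out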

    The Fifth NASA Symposium on VLSI Design

    The fifth annual NASA Symposium on VLSI Design had 13 sessions including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next generation advances that will serve as a basis for future VLSI design

    Kommunikation und Bildverarbeitung in der Automation

    This open-access proceedings volume contains the best contributions from the 9th annual colloquium "Kommunikation in der Automation" (KommA 2018) and the 6th annual colloquium "Bildverarbeitung in der Automation" (BVAu 2018). The colloquia took place on 20 and 21 November 2018 at the SmartFactoryOWL, a joint facility of Fraunhofer IOSB-INA and the Technische Hochschule Ostwestfalen-Lippe. The latest research results presented in the fields of industrial communication technology and image processing extend the current state of research and technology. The illustrative examples from the field of automation contained in the contributions place the results in a direct application context.

    Novel Architectures for Offloading and Accelerating Computations in Artificial Intelligence and Big Data

    Due to the end of Moore's Law and Dennard Scaling, performance gains in general-purpose architectures have slowed significantly in recent years. While raising the number of cores has been a viable approach to further performance increases, Amdahl's Law and its implications for parallelization limit the gains attainable this way. Consequently, research has shifted towards different approaches, including domain-specific custom architectures tailored to specific workloads. This has led to a new golden age for computer architecture, as noted in the Turing Award Lecture by Hennessy and Patterson, and has spawned several new architectures and architectural advances specifically targeted at currently prominent workloads, including Machine Learning. This thesis introduces a hierarchy of architectural improvements ranging from minor incremental changes, such as High-Bandwidth Memory, to more complex architectural extensions that offload workloads from the general-purpose CPU towards more specialized accelerators. Finally, we introduce novel architectural paradigms, namely Near-Data and In-Network Processing, as the most complex architectural improvements. This cumulative dissertation then investigates several architectural improvements to accelerate Sum-Product Networks, a novel Machine Learning approach from the class of Probabilistic Graphical Models. Furthermore, we use these improvements as case studies to discuss the impact of novel architectures, showing that minor and major architectural changes alike can significantly increase performance in Machine Learning applications. In addition, this thesis presents recent work on Near-Data Processing, which introduces smart storage devices as a novel architectural paradigm that is especially interesting in the context of Big Data. We discuss how Near-Data Processing can be applied to improve performance in different database settings by offloading database operations to smart storage devices. Offloading data-reductive operations, such as selections, reduces the amount of data transferred, thus improving performance and alleviating bandwidth-related bottlenecks. Using Near-Data Processing as a use case, we also discuss how Machine Learning approaches, like Sum-Product Networks, can improve novel architectures. Specifically, we introduce an approach for offloading Cardinality Estimation using Sum-Product Networks that could enable more intelligent decision-making in smart storage devices. Overall, we show that Machine Learning can benefit from the development of novel architectures, while also showing that Machine Learning can be applied to improve the applications of novel architectures.
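
    As a brief illustration of the Machine Learning model targeted by these accelerators, the sketch below evaluates a tiny hand-built Sum-Product Network over two binary variables bottom-up. The structure and weights are invented for illustration and have no relation to the dissertation's accelerated networks.

        # Bottom-up evaluation of a tiny Sum-Product Network (illustrative structure and weights).
        class Leaf:
            def __init__(self, var, p_true):
                self.var, self.p_true = var, p_true
            def value(self, x):
                return self.p_true if x[self.var] else 1.0 - self.p_true

        class Product:
            def __init__(self, children):
                self.children = children
            def value(self, x):
                v = 1.0
                for c in self.children:
                    v *= c.value(x)          # product nodes multiply over disjoint scopes
                return v

        class Sum:
            def __init__(self, weighted_children):
                self.weighted_children = weighted_children   # list of (weight, node) pairs
            def value(self, x):
                return sum(w * c.value(x) for w, c in self.weighted_children)

        # P(X0, X1) as a weighted mixture of two independent components.
        spn = Sum([(0.6, Product([Leaf(0, 0.9), Leaf(1, 0.2)])),
                   (0.4, Product([Leaf(0, 0.1), Leaf(1, 0.7)]))])
        print(spn.value({0: True, 1: False}))   # joint probability of one assignment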

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher-dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at a fixed time to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernova.
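
    A minimal way to realise the idea of propagating a spatial model to the next temporal stage is to warm-start each snapshot's mixture fit from the previous one, as sketched below. The use of scikit-learn's GaussianMixture and of warm-started means as the propagation step are assumptions for illustration, not the paper's actual Markovian scheme.

        # Fit a Gaussian mixture per snapshot, initialising each fit from the previous means.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def track_manifold(snapshots, n_components=5):
            models, means = [], None
            for pts in snapshots:                          # pts: (N, d) particle positions
                gm = GaussianMixture(n_components=n_components,
                                     means_init=means, random_state=0)
                gm.fit(pts)
                means = gm.means_                          # carry the spatial model forward
                models.append(gm)
            return models

        # Toy usage: three noisy 2-D snapshots drifting over time.
        rng = np.random.default_rng(0)
        snaps = [rng.normal(loc=t * 0.1, scale=1.0, size=(500, 2)) for t in range(3)]
        models = track_manifold(snaps)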