62 research outputs found

    Nonlinear moving-horizon state estimation for hardware implementation and a model predictive control application

    Get PDF
    Dissertação (mestrado)—Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Mecânica, 2021.Nesta dissertação, exploramos a aplicação de redes neurais artificiais de funções de base radial (RBFs) embutidas em hardware para estimação de estados e controle em tempo real utilizando os algoritmos de Moving-Horizon Estimation(MHE) e Model Predictive Control (MPC). Esses algoritmos foram posteriormente aproximados por RBFs e implementados em um Field Programmable Gate Array (FPGA), que tem mostrado bons resultados em termos de precisão e tempo ˜ computacional. Mostramos que a estimativa de estado usando a versão aproximada do MHE ˜ pode ser executada usando um kit em escala de laboratório de aproximadamente 500 kHz para ´ um pendulo invertido a uma taxa de clock de cerca de 110 MHz. A latência para fornecer uma estimativa pode ser reduzida ainda mais quando FPGAs com clocks mais altos são usados, pois a ˜ arquitetura da rede neural artificial e inerentemente paralela. Após uma inspeção mais detalhada, ˜ descobriu-se que era possível reduzir o custo da área de chip trocando a função de custo por uma ˜ com resultados mais facilmente representáveis. Ele poderia então utilizar uma representação em ˜ 32 bits e o modulo CORDIC poderia ser removido, usando apenas a aproximação mais simples da ˜ serie de Taylor de 2 ´ ª ordem. Em seguida, expandimos isso, investigando a ideia de usar uma única rede neural para substituir tanto o controle quanto o estimatidor de estados. Comparado a um MPC com informações completas, sua versão utilizando o MHE não teve um bom desempenho contra ˜ ruídos de saída. A princípio não foi possível aproximar o controle e a estimativa do pêndulo com um bom resultado, porem ao separar o controle em duas partes obtivemos melhores resultados. Por fim, verificamos que tal rede neural foi capaz de estabilizar o sistema de pendulo invertido, ˆ mas não de aproximar sua parte oscilante n ˜ ao linear. A solução aqui apresentada ˜ e encorajada a ser estendida para sistemas mais complexos e não lineares, uma vez que uma arquitetura com ˜ complexidade razoável é encontrada para a rede neural artificial para ser implementada.Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).In this dissertation, we explore the application of radial basis functions (RBFs) artificial neural networks embedded in hardware for real-time estimation and control algorithms as the Moving- Horizon Estimation (MHE) and the Model Predictive Control (MPC). These algorithms are then approximated using RBFs and implemented in a Field Programmable Gate Array (FPGA), which has shown good results in terms of accuracy and computational time. We show that the state estimate using the approximate version of the MHE can be run using a laboratory-scale kit of approximately 500 kHz for an inverted pendulum at a clock rate of about 110 MHz. The latency to provide an estimate can be further reduced when FPGAs with higher clocks are used as the artificial neural network architecture is inherently parallel. Upon further inspection, it was found to be possible to reduce the chip area cost by switching the cost function for one with more easily representable results. It could then utilize a 32-bits representation and the CORDIC module could be removed, using instead only the simpler 2o order Taylor approximation. We then expand upon this, probing at the idea of using a single neural network to substitute both the control and state-estimation. Compared to a MPC with full information, its version utilizing the MHE did not perform well against output noises. At first, it was not possible to approximate the pendulum control and estimation with a good result, however when separating the control in two parts we gained better outcomes. Lastly, we verify that such a neural network was capable of stabilizing the inverted pendulum system, but not of approximating the non-linear swing-up part of it. The solution herein presented is encouraged to be further extended for more complex and nonlinear systems, given that an architecture is found for the artificial neural network with reasonable complexity to be implemented

    Field programmable gate array based sigmoid function implementation using differential lookup table and second order nonlinear function

    Get PDF
    Artificial neural network (ANN) is an established artificial intelligence technique that is widely used for solving numerous problems such as classification and clustering in various fields. However, the major problem with ANN is a factor of time. ANN takes a longer time to execute a huge number of neurons. In order to overcome this, ANN is implemented into hardware namely field-programmable-gate-array (FPGA). However, implementing the ANN into a field-programmable gate array (FPGA) has led to a new problem related to the sigmoid function implementation. Often used as the activation function for ANN, a sigmoid function cannot be directly implemented in FPGA. Owing to its accuracy, the lookup table (LUT) has always been used to implement the sigmoid function in FPGA. In this case, obtaining the high accuracy of LUT is expensive particularly in terms of its memory requirements in FPGA. Second-order nonlinear function (SONF) is an appealing replacement for LUT due to its small memory requirement. Although there is a trade-off between accuracy and memory size. Taking the advantage of the aforementioned approaches, this thesis proposed a combination of SONF and a modified LUT namely differential lookup table (dLUT). The deviation values between SONF and sigmoid function are used to create the dLUT. SONF is used as the first step to approximate the sigmoid function. Then it is followed by adding or deducting with the value that has been stored in the dLUT as a second step as demonstrated via simulation. This combination has successfully reduced the deviation value. The reduction value is significant as compared to previous implementations such as SONF, and LUT itself. Further simulation has been carried out to evaluate the accuracy of the ANN in detecting the object in an indoor environment by using the proposed method as a sigmoid function. The result has proven that the proposed method has produced the output almost as accurately as software implementation in detecting the target in indoor positioning problems. Therefore, the proposed method can be applied in any field that demands higher processing and high accuracy in sigmoid function outpu

    Time series prediction and channel equalizer using artificial neural networks with VLSI implementation

    Get PDF
    The architecture and training procedure of a novel recurrent neural network (RNN), referred to as the multifeedbacklayer neural network (MFLNN), is described in this paper. The main difference of the proposed network compared to the available RNNs is that the temporal relations are provided by means of neurons arranged in three feedback layers, not by simple feedback elements, in order to enrich the representation capabilities of the recurrent networks. The feedback layers provide local and global recurrences via nonlinear processing elements. In these feedback layers, weighted sums of the delayed outputs of the hidden and of the output layers are passed through certain activation functions and applied to the feedforward neurons via adjustable weights. Both online and offline training procedures based on the backpropagation through time (BPTT) algorithm are developed. The adjoint model of the MFLNN is built to compute the derivatives with respect to the MFLNN weights which are then used in the training procedures. The Levenberg–Marquardt (LM) method with a trust region approach is used to update the MFLNN weights. The performance of the MFLNN is demonstrated by applying to several illustrative temporal problems including chaotic time series prediction and nonlinear dynamic system identification, and it performed better than several networks available in the literature

    Finding Unexpected Events in Staring Continuous-Dwell Sensor Data Streams Via Adaptive Prediction

    Get PDF
    This research produced a Predictive Anomaly Detector (PAD). It is an adaptive prediction-based approach to detecting unexpected events in data streams drawn from staring continuous-dwell sensors. The underlying technology is spectrum independent and does not depend on correlated data (neither temporal nor spatial) to achieve improved detection and extraction in highly robust environments. ( robust environment refers to the data stream\u27s control law being variable and the spectral content covering a wide range of wavelengths.) The resulting approach uses a network of simple building-block equations (basis functions) to predict the non-event data and thereby present subtle sub-streams to a detection model as potential events of interest. The prediction model is automatically created from sequential observations of the data stream. Once model construction is complete, it continues to evolve as new samples arrive. Each sample value that is sufficiently different from the model\u27s predicted value is postulated as an unexpected event. A subsequent detection model uses a set of rules to confirm unexpected events while ignoring outliers. Intruder detection in robust video scenes is the main focus, although one demonstration achieved voice detection in a noisy audio signal. These demonstrations are coupled to a concept of operations that emphasizes the spectrum-independence of this approach and its integration with other processing requirements such as target recognition and tracking. Primary benefits delivered by this work include the ability to process large data volumes for obscured or buried information within highly active environments. The fully automated nature of this technique helps mitigate manning shortfalls typically associated with sorting through large volumes of surveillance data using trained analysts. This approach enables an organization to perform automated cueing for these analysts so that they spend less time examining data where nothing of interest exists. This maximizes the value of skilled personnel by using them to assess data with true potential. In this way, larger data volumes can be processed in a shorter period of time leading to a higher likelihood that important events and signals will be found, analyzed, and acted upon

    Embedded Machine Learning: Emphasis on Hardware Accelerators and Approximate Computing for Tactile Data Processing

    Get PDF
    Machine Learning (ML) a subset of Artificial Intelligence (AI) is driving the industrial and technological revolution of the present and future. We envision a world with smart devices that are able to mimic human behavior (sense, process, and act) and perform tasks that at one time we thought could only be carried out by humans. The vision is to achieve such a level of intelligence with affordable, power-efficient, and fast hardware platforms. However, embedding machine learning algorithms in many application domains such as the internet of things (IoT), prostheses, robotics, and wearable devices is an ongoing challenge. A challenge that is controlled by the computational complexity of ML algorithms, the performance/availability of hardware platforms, and the application\u2019s budget (power constraint, real-time operation, etc.). In this dissertation, we focus on the design and implementation of efficient ML algorithms to handle the aforementioned challenges. First, we apply Approximate Computing Techniques (ACTs) to reduce the computational complexity of ML algorithms. Then, we design custom Hardware Accelerators to improve the performance of the implementation within a specified budget. Finally, a tactile data processing application is adopted for the validation of the proposed exact and approximate embedded machine learning accelerators. The dissertation starts with the introduction of the various ML algorithms used for tactile data processing. These algorithms are assessed in terms of their computational complexity and the available hardware platforms which could be used for implementation. Afterward, a survey on the existing approximate computing techniques and hardware accelerators design methodologies is presented. Based on the findings of the survey, an approach for applying algorithmic-level ACTs on machine learning algorithms is provided. Then three novel hardware accelerators are proposed: (1) k-Nearest Neighbor (kNN) based on a selection-based sorter, (2) Tensorial Support Vector Machine (TSVM) based on Shallow Neural Networks, and (3) Hybrid Precision Binary Convolution Neural Network (BCNN). The three accelerators offer a real-time classification with monumental reductions in the hardware resources and power consumption compared to existing implementations targeting the same tactile data processing application on FPGA. Moreover, the approximate accelerators maintain a high classification accuracy with a loss of at most 5%

    Algorithms and VLSI architectures for parametric additive synthesis

    Get PDF
    A parametric additive synthesis approach to sound synthesis is advantageous as it can model sounds in a large scale manner, unlike the classical sinusoidal additive based synthesis paradigms. It is known that a large body of naturally occurring sounds are resonant in character and thus fit the concept well. This thesis is concerned with the computational optimisation of a super class of form ant synthesis which extends the sinusoidal parameters with a spread parameter known as band width. Here a modified formant algorithm is introduced which can be traced back to work done at IRCAM, Paris. When impulse driven, a filter based approach to modelling a formant limits the computational work-load. It is assumed that the filter's coefficients are fixed at initialisation, thus avoiding interpolation which can cause the filter to become chaotic. A filter which is more complex than a second order section is required. Temporal resolution of an impulse generator is achieved by using a two stage polyphase decimator which drives many filterbanks. Each filterbank describes one formant and is composed of sub-elements which allow variation of the formant’s parameters. A resource manager is discussed to overcome the possibility of all sub- banks operating in unison. All filterbanks for one voice are connected in series to the impulse generator and their outputs are summed and scaled accordingly. An explorative study of number systems for DSP algorithms and their architectures is investigated. I invented a new theoretical mechanism for multi-level logic based DSP. Its aims are to reduce the number of transistors and to increase their functionality. A review of synthesis algorithms and VLSI architectures are discussed in a case study between a filter based bit-serial and a CORDIC based sinusoidal generator. They are both of similar size, but the latter is always guaranteed to be stable

    Central nervous system: overall considerations based on hardware realization of digital spiking silicon neurons (dssns) and synaptic coupling

    Get PDF
    The Central Nervous System (CNS) is the part of the nervous system including the brain and spinal cord. The CNS is so named because the brain integrates the received information and influences the activity of different sections of the bodies. The basic elements of this important organ are: neurons, synapses, and glias. Neuronal modeling approach and hardware realization design for the nervous system of the brain is an important issue in the case of reproducing the same biological neuronal behaviors. This work applies a quadratic-based modeling called Digital Spiking Silicon Neuron (DSSN) to propose a modified version of the neuronal model which is capable of imitating the basic behaviors of the original model. The proposed neuron is modeled based on the primary hyperbolic functions, which can be realized in high correlation state with the main model (original one). Really, if the high-cost terms of the original model, and its functions were removed, a low-error and high-performance (in case of frequency and speed-up) new model will be extracted compared to the original model. For testing and validating the new model in hardware state, Xilinx Spartan-3 FPGA board has been considered and used. Hardware results show the high-degree of similarity between the original and proposed models (in terms of neuronal behaviors) and also higher frequency and low-cost condition have been achieved. The implementation results show that the overall saving is more than other papers and also the original model. Moreover, frequency of the proposed neuronal model is about 168 MHz, which is significantly higher than the original model frequency, 63 MHz

    Hybrid Switch Reluctance Drives For Pump Applications

    Get PDF

    Signal processing architectures for automotive high-resolution MIMO radar systems

    Get PDF
    To date, the digital signal processing for an automotive radar sensor has been handled in an efficient way by general purpose signal processors and microcontrollers. However, increasing resolution requirements for automated driving on the one hand, as well as rapidly growing numbers of manufactured sensors on the other hand, can provoke a paradigm change in the near future. The design and development of highly specialized hardware accelerators could become a viable option - at least for the most demanding processing steps with data rates of several gigabits per second. In this work, application-specific signal processing architectures for future high-resolution multiple-input and multiple-output (MIMO) radar sensors are designed, implemented, investigated and optimized. A focus is set on real-time performance such that even sophisticated algorithms can be computed sufficiently fast. The full processing chain from the received baseband signals to a list of detections is considered, comprising three major steps: Spectrum analysis, target detection and direction of arrival estimation. The developed architectures are further implemented on a field-programmable gate array (FPGA) and important measurements like resource consumption, power dissipation or data throughput are evaluated and compared with other examples from literature. A substantial dataset, based on more than 3600 different parametrizations and variants, has been established with the help of a model-based design space exploration and is provided as part of this work. Finally, an experimental radar sensor has been built and is used under real-world conditions to verify the effectiveness of the proposed signal processing architectures.Bisher wurde die digitale Signalverarbeitung für automobile Radarsensoren auf eine effiziente Art und Weise von universell verwendbaren Mikroprozessoren bewältigt. Jedoch können steigende Anforderungen an das Auflösungsvermögen für hochautomatisiertes Fahren einerseits, sowie schnell wachsende Stückzahlen produzierter Sensoren andererseits, einen Paradigmenwechsel in naher Zukunft bewirken. Die Entwicklung von hochgradig spezialisierten Hardwarebeschleunigern könnte sich als eine praktikable Alternative etablieren - zumindest für die anspruchsvollsten Rechenschritte mit Datenraten von mehreren Gigabits pro Sekunde. In dieser Arbeit werden anwendungsspezifische Signalverarbeitungsarchitekturen für zukünftige, hochauflösende, MIMO Radarsensoren entworfen, realisiert, untersucht und optimiert. Der Fokus liegt dabei stets auf der Echtzeitfähigkeit, sodass selbst anspruchsvolle Algorithmen in einer ausreichend kurzen Zeit berechnet werden können. Die komplette Signalverarbeitungskette, beginnend von den empfangenen Signalen im Basisband bis hin zu einer Liste von Detektion, wird in dieser Arbeit behandelt. Die Kette gliedert sich im Wesentlichen in drei größere Teilschritte: Spektralanalyse, Zieldetektion und Winkelschätzung. Des Weiteren werden die entwickelten Architekturen auf einem FPGA implementiert und wichtige Kennzahlen wie Ressourcenverbrauch, Stromverbrauch oder Datendurchsatz ausgewertet und mit anderen Beispielen aus der Literatur verglichen. Ein umfangreicher Datensatz, welcher mehr als 3600 verschiedene Parametrisierungen und Varianten beinhaltet, wurde mit Hilfe einer modellbasierten Entwurfsraumexploration erstellt und ist in dieser Arbeit enthalten. Schließlich wurde ein experimenteller Radarsensor aufgebaut und dazu benutzt, die entworfenen Signalverarbeitungsarchitekturen unter realen Umgebungsbedingungen zu verifizieren
    corecore