151 research outputs found

    Survey of FPGA applications in the period 2000 – 2015 (Technical Report)

    Get PDF
    Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report).; 2017.Since their introduction, FPGAs can be seen in more and more different fields of applications. The key advantage is the combination of software-like flexibility with the performance otherwise common to hardware. Nevertheless, every application field introduces special requirements to the used computational architecture. This paper provides an overview of the different topics FPGAs have been used for in the last 15 years of research and why they have been chosen over other processing units like e.g. CPUs

    FPGA Implementation of Blind Source Separation using FastICA

    Get PDF
    Fast Independent Component Analysis (FastICA) is a statistical method used to separate signals from an unknown mixture without any prior knowledge about the signals. This method has been used in many applications like the separation of fetal and maternal Electrocardiogram (ECG) for pregnant women. This thesis presents an implementation of a fixed-point FastICA in field programmable gate array (FPGA). The proposed design can separate up to four signals using four sensors. QR decomposition is used to improve the speed of evaluation of the eigenvalues and eigenvectors of the covariance matrix. Moreover, a symmetric orthogonalization of the unit estimation algorithm is implemented using an iterative technique to speed up the search algorithm for higher order data input. The hardware is implemented using Xilinx virtex5-XC5VLX50t chip. The proposed design can process 128 samples for the four sensors in less than 63 ns when the design is simulated using 10 MHz clock

    Using reconfigurable computing technology to accelerate matrix decomposition and applications

    Get PDF
    Matrix decomposition plays an increasingly significant role in many scientific and engineering applications. Among numerous techniques, Singular Value Decomposition (SVD) and Eigenvalue Decomposition (EVD) are widely used as factorization tools to perform Principal Component Analysis for dimensionality reduction and pattern recognition in image processing, text mining and wireless communications, while QR Decomposition (QRD) and sparse LU Decomposition (LUD) are employed to solve the dense or sparse linear system of equations in bioinformatics, power system and computer vision. Matrix decompositions are computationally expensive and their sequential implementations often fail to meet the requirements of many time-sensitive applications. The emergence of reconfigurable computing has provided a flexible and low-cost opportunity to pursue high-performance parallel designs, and the use of FPGAs has shown promise in accelerating this class of computation. In this research, we have proposed and implemented several highly parallel FPGA-based architectures to accelerate matrix decompositions and their applications in data mining and signal processing. Specifically, in this dissertation we describe the following contributions: • We propose an efficient FPGA-based double-precision floating-point architecture for EVD, which can efficiently analyze large-scale matrices. • We implement a floating-point Hestenes-Jacobi architecture for SVD, which is capable of analyzing arbitrary sized matrices. • We introduce a novel deeply pipelined reconfigurable architecture for QRD, which can be dynamically configured to perform either Householder transformation or Givens rotation in a manner that takes advantage of the strengths of each. • We design a configurable architecture for sparse LUD that supports both symmetric and asymmetric sparse matrices with arbitrary sparsity patterns. • By further extending the proposed hardware solution for SVD, we parallelize a popular text mining tool-Latent Semantic Indexing with an FPGA-based architecture. • We present a configurable architecture to accelerate Homotopy l1-minimization, in which the modification of the proposed FPGA architecture for sparse LUD is used at its core to parallelize both Cholesky decomposition and rank-1 update. Our experimental results using an FPGA-based acceleration system indicate the efficiency of our proposed novel architectures, with application and dimension-dependent speedups over an optimized software implementation that range from 1.5ÃÂ to 43.6ÃÂ in terms of computation time

    An efficient implementation of lattice-ladder multilayer perceptrons in field programmable gate arrays

    Get PDF
    The implementation efficiency of electronic systems is a combination of conflicting requirements, as increasing volumes of computations, accelerating the exchange of data, at the same time increasing energy consumption forcing the researchers not only to optimize the algorithm, but also to quickly implement in a specialized hardware. Therefore in this work, the problem of efficient and straightforward implementation of operating in a real-time electronic intelligent systems on field-programmable gate array (FPGA) is tackled. The object of research is specialized FPGA intellectual property (IP) cores that operate in a real-time. In the thesis the following main aspects of the research object are investigated: implementation criteria and techniques. The aim of the thesis is to optimize the FPGA implementation process of selected class dynamic artificial neural networks. In order to solve stated problem and reach the goal following main tasks of the thesis are formulated: rationalize the selection of a class of Lattice-Ladder Multi-Layer Perceptron (LLMLP) and its electronic intelligent system test-bed – a speaker dependent Lithuanian speech recognizer, to be created and investigated; develop dedicated technique for implementation of LLMLP class on FPGA that is based on specialized efficiency criteria for a circuitry synthesis; develop and experimentally affirm the efficiency of optimized FPGA IP cores used in Lithuanian speech recognizer. The dissertation contains: introduction, four chapters and general conclusions. The first chapter reveals the fundamental knowledge on computer-aideddesign, artificial neural networks and speech recognition implementation on FPGA. In the second chapter the efficiency criteria and technique of LLMLP IP cores implementation are proposed in order to make multi-objective optimization of throughput, LLMLP complexity and resource utilization. The data flow graphs are applied for optimization of LLMLP computations. The optimized neuron processing element is proposed. The IP cores for features extraction and comparison are developed for Lithuanian speech recognizer and analyzed in third chapter. The fourth chapter is devoted for experimental verification of developed numerous LLMLP IP cores. The experiments of isolated word recognition accuracy and speed for different speakers, signal to noise ratios, features extraction and accelerated comparison methods were performed. The main results of the thesis were published in 12 scientific publications: eight of them were printed in peer-reviewed scientific journals, four of them in a Thomson Reuters Web of Science database, four articles – in conference proceedings. The results were presented in 17 scientific conferences

    Development of Novel Independent Component Analysis Techniques and their Applications

    Get PDF
    Real world problems very often provide minimum information regarding their causes. This is mainly due to the system complexities and noninvasive techniques employed by scientists and engineers to study such systems. Signal and image processing techniques used for analyzing such systems essentially tend to be blind. Earlier, training signal based techniques were used extensively for such analyses. But many times either these training signals are not practicable to be availed by the analyzer or become burden on the system itself. Hence blind signal/image processing techniques are becoming predominant in modern real time systems. In fact, blind signal processing has become a very important topic of research and development in many areas, especially biomedical engineering, medical imaging, speech enhancement, remote sensing, communication systems, exploration seismology, geophysics, econometrics, data mining, sensor networks etc. Blind Signal Processing has three major areas: Blind Signal Separation and Extraction, Independent Component Analysis (ICA) and Multichannel Blind Deconvolution and Equalization. ICA technique has also been typically applied to the other two areas mentioned above. Hence ICA research with its wide range of applications is quite interesting and has been taken up as the central domain of the present work

    Algorithm and architecture for simultaneous diagonalization of matrices applied to subspace-based speech enhancement

    Get PDF
    This thesis presents algorithm and architecture for simultaneous diagonalization of matrices. As an example, a subspace-based speech enhancement problem is considered, where in the covariance matrices of the speech and noise are diagonalized simultaneously. In order to compare the system performance of the proposed algorithm, objective measurements of speech enhancement is shown in terms of the signal to noise ratio and mean bark spectral distortion at various noise levels. In addition, an innovative subband analysis technique for subspace-based time-domain constrained speech enhancement technique is proposed. The proposed technique analyses the signal in its subbands to build accurate estimates of the covariance matrices of speech and noise, exploiting the inherent low varying characteristics of speech and noise signals in narrow bands. The subband approach also decreases the computation time by reducing the order of the matrices to be simultaneously diagonalized. Simulation results indicate that the proposed technique performs well under extreme low signal-to-noise-ratio conditions. Further, an architecture is proposed to implement the simultaneous diagonalization scheme. The architecture is implemented on an FPGA primarily to compare the performance measures on hardware and the feasibility of the speech enhancement algorithm in terms of resource utilization, throughput, etc. A Xilinx FPGA is targeted for implementation. FPGA resource utilization re-enforces on the practicability of the design. Also a projection of the design feasibility for an ASIC implementation in terms of transistor count only is include

    Dirty RF Signal Processing for Mitigation of Receiver Front-end Non-linearity

    Get PDF
    Moderne drahtlose Kommunikationssysteme stellen hohe und teilweise gegensätzliche Anforderungen an die Hardware der Funkmodule, wie z.B. niedriger Energieverbrauch, große Bandbreite und hohe Linearität. Die Gewährleistung einer ausreichenden Linearität ist, neben anderen analogen Parametern, eine Herausforderung im praktischen Design der Funkmodule. Der Fokus der Dissertation liegt auf breitbandigen HF-Frontends für Software-konfigurierbare Funkmodule, die seit einigen Jahren kommerziell verfügbar sind. Die praktischen Herausforderungen und Grenzen solcher flexiblen Funkmodule offenbaren sich vor allem im realen Experiment. Eines der Hauptprobleme ist die Sicherstellung einer ausreichenden analogen Performanz über einen weiten Frequenzbereich. Aus einer Vielzahl an analogen Störeffekten behandelt die Arbeit die Analyse und Minderung von Nichtlinearitäten in Empfängern mit direkt-umsetzender Architektur. Im Vordergrund stehen dabei Signalverarbeitungsstrategien zur Minderung nichtlinear verursachter Interferenz - ein Algorithmus, der besser unter "Dirty RF"-Techniken bekannt ist. Ein digitales Verfahren nach der Vorwärtskopplung wird durch intensive Simulationen, Messungen und Implementierung in realer Hardware verifiziert. Um die Lücken zwischen Theorie und praktischer Anwendbarkeit zu schließen und das Verfahren in reale Funkmodule zu integrieren, werden verschiedene Untersuchungen durchgeführt. Hierzu wird ein erweitertes Verhaltensmodell entwickelt, das die Struktur direkt-umsetzender Empfänger am besten nachbildet und damit alle Verzerrungen im HF- und Basisband erfasst. Darüber hinaus wird die Leistungsfähigkeit des Algorithmus unter realen Funkkanal-Bedingungen untersucht. Zusätzlich folgt die Vorstellung einer ressourceneffizienten Echtzeit-Implementierung des Verfahrens auf einem FPGA. Abschließend diskutiert die Arbeit verschiedene Anwendungsfelder, darunter spektrales Sensing, robuster GSM-Empfang und GSM-basiertes Passivradar. Es wird gezeigt, dass nichtlineare Verzerrungen erfolgreich in der digitalen Domäne gemindert werden können, wodurch die Bitfehlerrate gestörter modulierter Signale sinkt und der Anteil nichtlinear verursachter Interferenz minimiert wird. Schließlich kann durch das Verfahren die effektive Linearität des HF-Frontends stark erhöht werden. Damit wird der zuverlässige Betrieb eines einfachen Funkmoduls unter dem Einfluss der Empfängernichtlinearität möglich. Aufgrund des flexiblen Designs ist der Algorithmus für breitbandige Empfänger universal einsetzbar und ist nicht auf Software-konfigurierbare Funkmodule beschränkt.Today's wireless communication systems place high requirements on the radio's hardware that are largely mutually exclusive, such as low power consumption, wide bandwidth, and high linearity. Achieving a sufficient linearity, among other analogue characteristics, is a challenging issue in practical transceiver design. The focus of this thesis is on wideband receiver RF front-ends for software defined radio technology, which became commercially available in the recent years. Practical challenges and limitations are being revealed in real-world experiments with these radios. One of the main problems is to ensure a sufficient RF performance of the front-end over a wide bandwidth. The thesis covers the analysis and mitigation of receiver non-linearity of typical direct-conversion receiver architectures, among other RF impairments. The main focus is on DSP-based algorithms for mitigating non-linearly induced interference, an approach also known as "Dirty RF" signal processing techniques. The conceived digital feedforward mitigation algorithm is verified through extensive simulations, RF measurements, and implementation in real hardware. Various studies are carried out that bridge the gap between theory and practical applicability of this approach, especially with the aim of integrating that technique into real devices. To this end, an advanced baseband behavioural model is developed that matches to direct-conversion receiver architectures as close as possible, and thus considers all generated distortions at RF and baseband. In addition, the algorithm's performance is verified under challenging fading conditions. Moreover, the thesis presents a resource-efficient real-time implementation of the proposed solution on an FPGA. Finally, different use cases are covered in the thesis that includes spectrum monitoring or sensing, GSM downlink reception, and GSM-based passive radar. It is shown that non-linear distortions can be successfully mitigated at system level in the digital domain, thereby decreasing the bit error rate of distorted modulated signals and reducing the amount of non-linearly induced interference. Finally, the effective linearity of the front-end is increased substantially. Thus, the proper operation of a low-cost radio under presence of receiver non-linearity is possible. Due to the flexible design, the algorithm is generally applicable for wideband receivers and is not restricted to software defined radios

    Design of large polyphase filters in the Quadratic Residue Number System

    Full text link

    Improving Maternal and Fetal Cardiac Monitoring Using Artificial Intelligence

    Get PDF
    Early diagnosis of possible risks in the physiological status of fetus and mother during pregnancy and delivery is critical and can reduce mortality and morbidity. For example, early detection of life-threatening congenital heart disease may increase survival rate and reduce morbidity while allowing parents to make informed decisions. To study cardiac function, a variety of signals are required to be collected. In practice, several heart monitoring methods, such as electrocardiogram (ECG) and photoplethysmography (PPG), are commonly performed. Although there are several methods for monitoring fetal and maternal health, research is currently underway to enhance the mobility, accuracy, automation, and noise resistance of these methods to be used extensively, even at home. Artificial Intelligence (AI) can help to design a precise and convenient monitoring system. To achieve the goals, the following objectives are defined in this research: The first step for a signal acquisition system is to obtain high-quality signals. As the first objective, a signal processing scheme is explored to improve the signal-to-noise ratio (SNR) of signals and extract the desired signal from a noisy one with negative SNR (i.e., power of noise is greater than signal). It is worth mentioning that ECG and PPG signals are sensitive to noise from a variety of sources, increasing the risk of misunderstanding and interfering with the diagnostic process. The noises typically arise from power line interference, white noise, electrode contact noise, muscle contraction, baseline wandering, instrument noise, motion artifacts, electrosurgical noise. Even a slight variation in the obtained ECG waveform can impair the understanding of the patient's heart condition and affect the treatment procedure. Recent solutions, such as adaptive and blind source separation (BSS) algorithms, still have drawbacks, such as the need for noise or desired signal model, tuning and calibration, and inefficiency when dealing with excessively noisy signals. Therefore, the final goal of this step is to develop a robust algorithm that can estimate noise, even when SNR is negative, using the BSS method and remove it based on an adaptive filter. The second objective is defined for monitoring maternal and fetal ECG. Previous methods that were non-invasive used maternal abdominal ECG (MECG) for extracting fetal ECG (FECG). These methods need to be calibrated to generalize well. In other words, for each new subject, a calibration with a trustable device is required, which makes it difficult and time-consuming. The calibration is also susceptible to errors. We explore deep learning (DL) models for domain mapping, such as Cycle-Consistent Adversarial Networks, to map MECG to fetal ECG (FECG) and vice versa. The advantages of the proposed DL method over state-of-the-art approaches, such as adaptive filters or blind source separation, are that the proposed method is generalized well on unseen subjects. Moreover, it does not need calibration and is not sensitive to the heart rate variability of mother and fetal; it can also handle low signal-to-noise ratio (SNR) conditions. Thirdly, AI-based system that can measure continuous systolic blood pressure (SBP) and diastolic blood pressure (DBP) with minimum electrode requirements is explored. The most common method of measuring blood pressure is using cuff-based equipment, which cannot monitor blood pressure continuously, requires calibration, and is difficult to use. Other solutions use a synchronized ECG and PPG combination, which is still inconvenient and challenging to synchronize. The proposed method overcomes those issues and only uses PPG signal, comparing to other solutions. Using only PPG for blood pressure is more convenient since it is only one electrode on the finger where its acquisition is more resilient against error due to movement. The fourth objective is to detect anomalies on FECG data. The requirement of thousands of manually annotated samples is a concern for state-of-the-art detection systems, especially for fetal ECG (FECG), where there are few publicly available FECG datasets annotated for each FECG beat. Therefore, we will utilize active learning and transfer-learning concept to train a FECG anomaly detection system with the least training samples and high accuracy. In this part, a model is trained for detecting ECG anomalies in adults. Later this model is trained to detect anomalies on FECG. We only select more influential samples from the training set for training, which leads to training with the least effort. Because of physician shortages and rural geography, pregnant women's ability to get prenatal care might be improved through remote monitoring, especially when access to prenatal care is limited. Increased compliance with prenatal treatment and linked care amongst various providers are two possible benefits of remote monitoring. If recorded signals are transmitted correctly, maternal and fetal remote monitoring can be effective. Therefore, the last objective is to design a compression algorithm that can compress signals (like ECG) with a higher ratio than state-of-the-art and perform decompression fast without distortion. The proposed compression is fast thanks to the time domain B-Spline approach, and compressed data can be used for visualization and monitoring without decompression owing to the B-spline properties. Moreover, the stochastic optimization is designed to retain the signal quality and does not distort signal for diagnosis purposes while having a high compression ratio. In summary, components for creating an end-to-end system for day-to-day maternal and fetal cardiac monitoring can be envisioned as a mix of all tasks listed above. PPG and ECG recorded from the mother can be denoised using deconvolution strategy. Then, compression can be employed for transmitting signal. The trained CycleGAN model can be used for extracting FECG from MECG. Then, trained model using active transfer learning can detect anomaly on both MECG and FECG. Simultaneously, maternal BP is retrieved from the PPG signal. This information can be used for monitoring the cardiac status of mother and fetus, and also can be used for filling reports such as partogram

    Parallel and Distributed Computing

    Get PDF
    The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. Particularly, the topics that are addressed are programmable and reconfigurable devices and systems, dependability of GPUs (General Purpose Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peertopeer networks, largescale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing
    corecore