2,165 research outputs found

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes

    Improving Maternal and Fetal Cardiac Monitoring Using Artificial Intelligence

    Early diagnosis of possible risks in the physiological status of the fetus and mother during pregnancy and delivery is critical and can reduce mortality and morbidity. For example, early detection of life-threatening congenital heart disease may increase the survival rate and reduce morbidity while allowing parents to make informed decisions. Studying cardiac function requires collecting a variety of signals. In practice, several heart monitoring methods, such as electrocardiogram (ECG) and photoplethysmography (PPG), are commonly used. Although there are several methods for monitoring fetal and maternal health, research is currently underway to enhance the mobility, accuracy, automation, and noise resistance of these methods so that they can be used extensively, even at home. Artificial Intelligence (AI) can help to design a precise and convenient monitoring system. To achieve these goals, the following objectives are defined in this research. The first step for a signal acquisition system is to obtain high-quality signals. As the first objective, a signal processing scheme is explored to improve the signal-to-noise ratio (SNR) of signals and extract the desired signal from a noisy one with negative SNR (i.e., the noise power is greater than the signal power). ECG and PPG signals are sensitive to noise from a variety of sources, which increases the risk of misinterpretation and interferes with the diagnostic process. The noise typically arises from power line interference, white noise, electrode contact noise, muscle contraction, baseline wander, instrument noise, motion artifacts, and electrosurgical noise. Even a slight variation in the obtained ECG waveform can impair the understanding of the patient's heart condition and affect the treatment procedure.
Recent solutions, such as adaptive and blind source separation (BSS) algorithms, still have drawbacks, such as the need for a noise or desired-signal model, tuning and calibration, and inefficiency when dealing with excessively noisy signals. Therefore, the final goal of this step is to develop a robust algorithm that can estimate noise, even when the SNR is negative, using the BSS method and remove it with an adaptive filter. The second objective concerns monitoring maternal and fetal ECG. Previous non-invasive methods used maternal abdominal ECG (MECG) to extract fetal ECG (FECG). These methods need to be calibrated to generalize well. In other words, each new subject requires calibration with a trusted device, which is difficult and time-consuming. The calibration is also susceptible to errors. We explore deep learning (DL) models for domain mapping, such as Cycle-Consistent Adversarial Networks (CycleGAN), to map MECG to FECG and vice versa. The advantage of the proposed DL method over state-of-the-art approaches, such as adaptive filters or blind source separation, is that it generalizes well to unseen subjects. Moreover, it does not need calibration and is not sensitive to the heart rate variability of the mother and fetus; it can also handle low-SNR conditions. Third, an AI-based system that can measure continuous systolic blood pressure (SBP) and diastolic blood pressure (DBP) with minimal electrode requirements is explored. The most common method of measuring blood pressure uses cuff-based equipment, which cannot monitor blood pressure continuously, requires calibration, and is difficult to use. Other solutions use a synchronized ECG and PPG combination, which is still inconvenient and challenging to synchronize. The proposed method overcomes those issues and, unlike other solutions, uses only the PPG signal.
Using only PPG for blood pressure is more convenient since it requires only one electrode on the finger, where acquisition is more resilient to movement-induced error. The fourth objective is to detect anomalies in FECG data. The requirement of thousands of manually annotated samples is a concern for state-of-the-art detection systems, especially for FECG, where few publicly available datasets are annotated for each FECG beat. Therefore, we use active learning and transfer learning to train an FECG anomaly detection system with the fewest training samples and high accuracy. In this part, a model is first trained to detect ECG anomalies in adults. This model is then trained further to detect anomalies in FECG. We select only the more influential samples from the training set, which leads to training with the least effort. Because of physician shortages and rural geography, remote monitoring might improve pregnant women's ability to get prenatal care, especially where access to such care is limited. Increased compliance with prenatal treatment and linked care among various providers are two possible benefits of remote monitoring. Maternal and fetal remote monitoring can be effective if recorded signals are transmitted correctly. Therefore, the last objective is to design a compression algorithm that can compress signals (like ECG) with a higher ratio than the state of the art and perform decompression fast without distortion. The proposed compression is fast thanks to the time-domain B-spline approach, and compressed data can be used for visualization and monitoring without decompression owing to the B-spline properties. Moreover, the stochastic optimization is designed to retain signal quality and does not distort the signal for diagnostic purposes while achieving a high compression ratio.
In summary, an end-to-end system for day-to-day maternal and fetal cardiac monitoring can be envisioned as a combination of all the tasks listed above. PPG and ECG recorded from the mother can be denoised using a deconvolution strategy. Then, compression can be employed for transmitting the signal. The trained CycleGAN model can be used to extract FECG from MECG. Then, the model trained using active transfer learning can detect anomalies in both MECG and FECG. Simultaneously, maternal BP is retrieved from the PPG signal. This information can be used for monitoring the cardiac status of the mother and fetus, and can also be used for filling in reports such as the partogram.
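The negative-SNR condition mentioned in this abstract (noise power greater than signal power) can be illustrated with a minimal sketch. This is not the thesis's method; the function name `snr_db` and the toy sine waves are assumptions made only for illustration.

```python
import math

def snr_db(signal, noise):
    """Return the signal-to-noise ratio in decibels: 10*log10(P_signal / P_noise)."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)

# Hypothetical toy data: a unit-amplitude sine plus an interfering
# sine of twice the amplitude standing in for noise.
n = 1000
signal = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
noise = [2.0 * math.sin(2 * math.pi * 50 * k / n + 0.3) for k in range(n)]

value = snr_db(signal, noise)
# The noise has four times the signal power, so the SNR is negative (about -6 dB).
print(f"SNR = {value:.2f} dB")
```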

    Toward sparse and geometry adapted video approximations

    Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on the rate-distortion performance of wavelet and oracle based coding schemes, one can better analyze the coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require adaptive signal decompositions able to capture the structure and redundancy appearing in video signals. Adaptivity must allow signals to be modeled properly so that they can be represented at the lowest possible coding cost. Video is a highly structured signal with high geometric content. This includes temporal geometry (normally represented by motion information) as well as spatial geometry. Clearly, most past and present strategies used to represent video signals do not properly exploit their spatial geometry. As in the case of images, a very interesting approach seems to be the decomposition of video using large over-complete libraries of basis functions able to represent salient geometric features of the signal. In the framework of video, these features should model 2D geometric video components as well as their temporal evolution, forming spatio-temporal 3D geometric primitives. In this PhD dissertation, different aspects of the use of adaptivity in video representation are studied with a view to exploiting both aspects of video: its piecewise nature and its geometry. The first part of this work studies the use of localized temporal adaptivity in subband video coding. This is done considering two transformation schemes used for video coding: 3D wavelet representations and motion compensated temporal filtering.
A theoretical R-D analysis as well as empirical results demonstrate how temporal adaptivity improves the coding performance of moving edges in 3D transform based video coding (without motion compensation). At the same time, adaptivity allows redundancy in non-moving video areas to be exploited equally well. The analogy between motion compensated video and 1D piecewise-smooth signals is studied as well. This motivates the introduction of local length adaptivity within frame-adaptive motion compensated lifted wavelet decompositions. This allows optimal rate-distortion performance when video motion trajectories are shorter than the transformation "Group Of Pictures", or when efficient motion compensation cannot be ensured. After studying temporal adaptivity, the second part of this thesis is dedicated to understanding the fundamentals of how temporal and spatial geometry can be jointly exploited. This work builds on previous results that considered the representation of spatial geometry in video (but not temporal geometry, i.e., without motion). Obtaining flexible and efficient (sparse) signal representations with redundant dictionaries requires highly non-linear decomposition algorithms, like Matching Pursuit. General signal representation using these techniques is still largely unexplored. For this reason, prior to the study of video representation, some aspects of non-linear decomposition algorithms and the efficient decomposition of images using Matching Pursuit and a geometric dictionary are investigated. Part of this investigation concerns the influence of using a priori models within non-linear approximation algorithms. Dictionaries with high internal coherence have difficulty yielding optimally sparse signal representations when used with Matching Pursuit.
It is proved, theoretically and empirically, that inserting a priori models into this algorithm improves its capacity to obtain sparse signal approximations, mainly when coherent dictionaries are used. Another point discussed in this preliminary study of Matching Pursuit concerns the approach used in this work for the decomposition of video frames and images. The technique proposed in this thesis improves on previous work, where the authors had to resort to sub-optimal Matching Pursuit strategies (using Genetic Algorithms) given the size of the function library. In this work, full search strategies are made possible, while approximation efficiency is significantly improved and computational complexity is reduced. Finally, a priori based Matching Pursuit geometric decompositions are investigated for geometric video representations. Regularity constraints are taken into account to recover the temporal evolution of spatial geometric signal components. The results obtained for coding and multi-modal (audio-visual) signal analysis clarify many unknowns, are promising, and encourage further research on the subject.
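For readers unfamiliar with Matching Pursuit, the greedy full-search step referred to in this abstract can be sketched as follows. This is the plain textbook algorithm over unit-norm atoms, not the a priori based variant or the geometric dictionary developed in the thesis; the function names and the tiny dictionary are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, dictionary, n_iter):
    """Plain greedy Matching Pursuit; atoms are assumed to have unit norm.

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        # Full search: correlate the residual with every atom in the dictionary.
        corrs = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(corrs[i]))
        coef = corrs[best]
        picks.append((best, coef))
        # Remove the selected atom's contribution from the residual.
        residual = [r - coef * a for r, a in zip(residual, dictionary[best])]
    return picks, residual

# Hypothetical overcomplete dictionary: three unit-norm atoms in R^2.
s = 1.0 / math.sqrt(2.0)
dictionary = [[1.0, 0.0], [0.0, 1.0], [s, s]]
signal = [3.0, 3.0]

picks, residual = matching_pursuit(signal, dictionary, n_iter=1)
print(picks, residual)
```

In this toy case a single iteration selects the diagonal atom, which represents the signal with one coefficient where the two axis-aligned atoms would need two; this is the kind of sparsity gain a redundant dictionary is meant to provide.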

    High mobility in OFDM based wireless communication systems

    Orthogonal Frequency Division Multiplexing (OFDM) has been adopted as the transmission scheme in most of the wireless systems we use on a daily basis. It brings with it several inherent advantages that make it an ideal waveform candidate in the physical layer. However, OFDM based wireless systems are severely affected in high mobility scenarios. In this thesis, we investigate the effects of mobility on OFDM based wireless systems and develop novel techniques to estimate the channel and compensate for its effects at the receiver. Compressed Sensing (CS) based channel estimation techniques like the Rake Matching Pursuit (RMP) and the Gradient Rake Matching Pursuit (GRMP) are developed to estimate the channel in a precise, robust and computationally efficient manner. In addition, a Cognitive Framework that can detect the mobility in the channel and configure an optimal estimation scheme is developed and tested. The Cognitive Framework ensures a computationally optimal channel estimation scheme in all channel conditions. We also demonstrate that the proposed schemes can be adapted to other wireless standards easily. Accordingly, evaluation is done for three current broadcast, broadband and cellular standards. The results show the clear benefit of the proposed schemes in enabling high mobility in OFDM based wireless communication systems.

    30th International Conference on Condition Monitoring and Diagnostic Engineering Management (COMADEM 2017)

    Proceedings of COMADEM 201

    Machine Learning and Data Mining Applications in Power Systems

    This Special Issue was intended as a forum to advance research and apply machine-learning and data-mining methods to facilitate the development of modern electric power systems, grids and devices, and smart grids and protection devices, as well as to develop tools for more accurate and efficient power system analysis. Conventional signal processing is no longer adequate to extract all the relevant information from distorted signals through filtering, estimation, and detection to facilitate decision-making and control actions. Machine learning algorithms, optimization techniques, efficient numerical algorithms, distributed signal processing, data mining, and statistical signal detection and estimation may help to solve contemporary challenges in modern power systems. The increased use of digital information and control technology can improve the grid’s reliability, security, and efficiency; the dynamic optimization of grid operations; demand response; the incorporation of demand-side resources and integration of energy-efficient resources; distribution automation; and the integration of smart appliances and consumer devices. Signal processing offers the tools needed to convert measurement data to information, and to transform information into actionable intelligence. This Special Issue includes fifteen articles, authored by international research teams from several countries.