17 research outputs found

    A survey on artificial intelligence-based acoustic source identification

    The concept of Acoustic Source Identification (ASI), which refers to the process of identifying noise sources, has attracted increasing attention in recent years. ASI technology can be used for surveillance, monitoring, and maintenance applications in a wide range of sectors, such as defence, manufacturing, healthcare, and agriculture. Acoustic signature analysis and pattern recognition remain the core technologies for noise source identification. Manual identification of acoustic signatures, however, has become increasingly challenging as dataset sizes grow. As a result, the use of Artificial Intelligence (AI) techniques for identifying noise sources has become increasingly relevant and useful. In this paper, we provide a comprehensive review of AI-based acoustic source identification techniques. We analyze the strengths and weaknesses of AI-based ASI processes and associated methods proposed by researchers in the literature. Additionally, we present a detailed survey of ASI applications in machinery, underwater applications, environment/event source recognition, healthcare, and other fields. We also highlight relevant research directions.

    Articulatory Copy Synthesis Based on the Speech Synthesizer VocalTractLab

    Articulatory copy synthesis (ACS), a subarea of speech inversion, refers to the reproduction of natural utterances and involves both the physiological articulatory processes and their corresponding acoustic results. This thesis proposes two novel methods for the ACS of human speech using the articulatory speech synthesizer VocalTractLab (VTL) to address or mitigate the existing problems of speech inversion, such as non-unique mapping, acoustic variation among different speakers, and the time-consuming nature of the process. The first method involved finding appropriate VTL gestural scores for given natural utterances using a genetic algorithm. It consisted of two steps: gestural score initialization and optimization. In the first step, gestural scores were initialized from the given acoustic signals using speech recognition, grapheme-to-phoneme (G2P) conversion, and a VTL rule-based method for converting phoneme sequences to gestural scores. In the second step, the initial gestural scores were optimized by a genetic algorithm via an analysis-by-synthesis (ABS) procedure that sought to minimize the cosine distance between the acoustic features of the synthetic and natural utterances. The articulatory parameters were also regularized during the optimization process to restrict them to reasonable values. The second method was based on long short-term memory (LSTM) and convolutional neural networks, which were responsible for capturing the temporal dependence and the spatial structure of the acoustic features, respectively. Neural network regression models were trained that took acoustic features as inputs and produced articulatory trajectories as outputs. In addition, to cover as much of the articulatory and acoustic space as possible, the training samples were augmented by manipulating the phonation type, speaking effort, and vocal tract length of the synthetic utterances. Furthermore, two regularization methods were proposed: one based on the smoothness loss of articulatory trajectories and another based on the acoustic loss between original and predicted acoustic features. The best-performing genetic algorithm and convolutional LSTM systems (evaluated in terms of the difference between the estimated and reference VTL articulatory parameters) obtained average correlation coefficients of 0.985 and 0.983 for speaker-dependent utterances, respectively, and their reproduced speech achieved recognition accuracies of 86.25% and 64.69% for speaker-independent utterances of German words, respectively. When applied to German sentence utterances, as well as English and Mandarin Chinese word utterances, the neural network based ACS systems achieved recognition accuracies of 73.88%, 52.92%, and 52.41%, respectively. The results showed that both methods reproduced not only the articulatory processes but also the acoustic signals of the reference utterances. Moreover, the regularization methods led to more physiologically plausible articulatory processes and made the estimated articulatory trajectories more articulatorily preferred by VTL, thus reproducing more natural and intelligible speech. This study also found that the convolutional layers, when used in conjunction with batch normalization layers, automatically learned more distinctive features from log power spectrograms. Furthermore, the neural network based ACS systems trained using German data could be generalized to the utterances of other languages.
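As a rough illustration of two quantities named in the abstract above, the sketch below computes a cosine distance between two acoustic feature vectors and a simple smoothness penalty on an articulatory trajectory. The function names, the mean-squared-difference form of the penalty, and the handling of all-zero vectors are assumptions made for this example; the thesis abstract does not specify them.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption for this sketch: treat an all-zero vector as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def smoothness_penalty(trajectory):
    """Mean squared frame-to-frame change of one articulatory parameter.

    Assumed form of a smoothness loss; smaller values mean a smoother trajectory.
    """
    if len(trajectory) < 2:
        return 0.0
    diffs = [(nxt - cur) ** 2 for cur, nxt in zip(trajectory, trajectory[1:])]
    return sum(diffs) / len(diffs)
```

In an analysis-by-synthesis loop, a value like `cosine_distance(synthetic_features, natural_features)` would be the quantity to minimize, while a term like `smoothness_penalty` could be added with some weight to discourage jittery trajectories.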

    Algoritmos de Enjambre para la Optimización de HMM en la Detección de Soplos Cardíacos en Señales Fonocardiográficas Usando Representaciones Derivadas del Análisis de Vibraciones

    This work presents a methodology for developing an automatic support system for the classification of phonocardiographic (PCG) signals. First, the PCG signals were preprocessed. They were then decomposed independently using empirical mode decomposition (EMD), with some of its variants, and Hilbert vibration decomposition (HVD); the computational cost and the error in reconstructing the original signal were compared, generating constructs from the IMFs. Next, features were extracted using the statistical moments of the data generated by the Hilbert-Huang transform (HHT), together with Mel-frequency cepstral coefficients (MFCC) and four of their variants. Finally, a subset of features was selected using fuzzy rough sets (FRS), principal component analysis (PCA), and sequential floating forward selection (SFFS) simultaneously, to be used as inputs to an ergodic hidden Markov model (HMM) tuned with particle swarm optimization (PSO), in order to provide an objective and accurate mechanism for improving the reliability of heart murmur detection. Classification results of about 96% were obtained, with sensitivity values above 0.8 and specificity values above 0.9, using cross-validation (70/30 split with 30 folds).
    Magister en Automatización y Contro
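For readers unfamiliar with the sensitivity and specificity figures reported in the abstract above, the following sketch shows how such metrics are commonly computed from binary labels. The function name and the label convention (1 for murmur, 0 for normal) are assumptions made for this example and are not taken from the thesis.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    Assumed convention: 1 marks the positive class (e.g. murmur), 0 the negative.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # fraction of positives that were detected
    specificity = tn / (tn + fp)  # fraction of negatives correctly rejected
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy
```

Under a repeated 70/30 hold-out scheme like the one the abstract mentions, these values would typically be computed on each test split and then averaged.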

    Remote Sensing of the Oceans

    This book covers different topics in the framework of remote sensing of the oceans. Latest research advancements and brand-new studies are presented that address the exploitation of remote sensing instruments and simulation tools to improve the understanding of ocean processes and enable cutting-edge applications with the aim of preserving the ocean environment and supporting the blue economy. Hence, this book provides a reference framework for state-of-the-art remote sensing methods that deal with the generation of added-value products and the geophysical information retrieval in related fields, including: Oil spill detection and discrimination; Analysis of tropical cyclones and sea echoes; Shoreline and aquaculture area extraction; Monitoring coastal marine litter and moving vessels; Processing of SAR, HF radar and UAV measurements

    Signal Processing Using Non-invasive Physiological Sensors

    Non-invasive biomedical sensors monitor physiological parameters from the human body for potential future therapies and healthcare solutions. Today, a critical factor in providing a cost-effective healthcare system is improving patients' quality of life and mobility, which can be achieved by developing non-invasive sensor systems that can be deployed at the point of care, used at home, or integrated into wearable devices for long-term data collection. Another factor that plays an integral part in a cost-effective healthcare system is the signal processing of the data recorded with non-invasive biomedical sensors. In this book, we aimed to attract researchers who are interested in the application of signal processing methods to different biomedical signals, such as electroencephalogram (EEG), electromyogram (EMG), functional near-infrared spectroscopy (fNIRS), electrocardiogram (ECG), galvanic skin response, pulse oximetry, photoplethysmogram (PPG), etc. We encouraged new signal processing methods or the use of existing signal processing methods in novel applications to physiological signals to help healthcare providers make better decisions.