
    Ecosystem Monitoring and Port Surveillance Systems

    In this project, we build a novel system for sustainable, long-term monitoring of coastal marine ecosystems and for enhanced port surveillance. The outcomes are based on the analysis, classification, and fusion of heterogeneous data collected with different sensors (hydrophones, sonars, various camera types, etc.). This manuscript introduces the identified approaches and the system structure, and focuses on the techniques and concepts developed to address several problems arising in the project. The new system addresses the shortcomings of traditional approaches based on measuring environmental parameters, which are expensive and fail to provide adequate large-scale monitoring. More efficient monitoring will also enable improved analysis of climate change and provide knowledge informing the civil authority's economic relationship with its coastal marine ecosystems.

    Advances in Sonar Technology

    The demand to explore one of the largest and richest parts of our planet, advances in signal processing driven by an exponential growth in computational power, and a thorough study of sound propagation in the underwater realm have led to remarkable advances in sonar technology in recent years. The work at hand gathers the knowledge of several authors who have contributed to various aspects of sonar technology. This book intends to give a broad overview of the advances in sonar technology in recent years that resulted from the authors' research on both sonar systems and their applications. It is intended for scientists and engineers from a variety of backgrounds; even those who have never had contact with sonar technology before will find an easy introduction to the topics and principles presented here.

    Subband beamforming with higher order statistics for distant speech recognition

    This dissertation presents novel beamforming methods for distant speech recognition (DSR). Such techniques can relieve users of the necessity of wearing close-talking microphones. DSR systems are useful in many applications such as humanoid robots, voice control systems for automobiles, and automatic meeting transcription systems. A main problem in DSR is that recognition performance degrades seriously when the speaker is far from the microphones. To avoid this degradation, noise and reverberation should be removed from the signals received by the microphones. Acoustic beamforming techniques have the potential to enhance speech from the far field with little distortion, since they can maintain a distortionless constraint for a look direction. In beamforming, multiple signals propagating from a position are captured with multiple microphones. Typical conventional beamformers then adjust their weights so as to minimize the variance of their own outputs subject to a distortionless constraint in the look direction. The variance is the average of the second power (square) of the beamformer's outputs. Accordingly, the conventional beamformer can be said to use second-order statistics (SOS) of its outputs. Conventional beamforming techniques can effectively place a null on any source of interference. However, the desired signal is also canceled in reverberant environments, which is known as the signal cancellation problem. Many algorithms have been developed to avoid that problem, but none of them essentially solves signal cancellation in reverberant environments. While many efforts have been made to overcome the signal cancellation problem in the field of acoustic beamforming, researchers have addressed another research issue with the microphone array, namely blind source separation (BSS) [1].
BSS techniques aim at separating sources from a mixture of signals without information about the geometry of the microphone array or the positions of the sources. This is achieved by multiplying the input signals by an un-mixing matrix, constructed so that the outputs are stochastically independent. Measuring the stochastic independence of the signals is based on the theory of independent component analysis (ICA) [1]. The field of ICA rests on the fact that distributions of information-bearing signals are not Gaussian, whereas distributions of sums of various signals are close to Gaussian. There are two popular criteria for measuring the degree of non-Gaussianity, namely kurtosis and negentropy. As described in detail in this thesis, both criteria use more than the second moment; accordingly, they are referred to as higher-order statistics (HOS), in contrast to SOS. HOS have received little attention in the field of acoustic beamforming, although Arai et al. showed the similarity between acoustic beamforming and BSS [2]. This thesis investigates new beamforming algorithms which take HOS into consideration. The new beamforming methods adjust the beamformer's weights based on one of the following criteria: • minimum mutual information of the two beamformer outputs, • maximum negentropy of the beamformer outputs, and • maximum kurtosis of the beamformer outputs. As shown in this thesis, those algorithms do not suffer from signal cancellation. Notice that the new beamforming techniques can keep the distortionless constraint for the direction of interest, in contrast to the BSS algorithms. The effectiveness of the new techniques is finally demonstrated through a series of distant automatic speech recognition experiments on real data recorded with real sensors, unlike other work where signals artificially convolved with measured impulse responses are considered.
Significant improvements are achieved by the beamforming algorithms proposed here. This dissertation presents new methods for distant speech recognition. With these methods it is possible to dispense with close-talking microphones. Speech recognition systems that do without close-talking microphones are useful in many applications, for example in humanoid robots, voice control systems for cars, or automatic meeting transcription systems. A main problem in distant speech recognition is that recognition accuracy degrades sharply as the distance between speaker and microphone increases. It is therefore essential to remove the disturbances, namely background noise, reverberation, and echo, from the microphone signals. Using multiple microphones allows a spatial separation of the desired signal from the disturbances; this method is called acoustic beamforming. Conventional acoustic beamformers adjust their weights so that the variance of the output signal is minimized, while the signal in the look direction must satisfy a distortionless constraint. The variance is defined as the mean square of the output signal; conventional beamforming methods thus use second-order statistics (SOS) of the output signal. Conventional beamformers can suppress interference sources efficiently, but unfortunately also the desired signal. This undesired suppression of the desired signal is known as signal cancellation, and many algorithms have been developed to avoid it; none of them, however, works effectively in reverberant environments. Another method of separating the desired signal from the disturbances, this time without using geometric information, is called blind source separation (BSS) [1]. Here a matrix multiplication is applied to the input signal.
The matrix must be constructed so that the output signals are statistically independent of one another. Statistical independence is measured with the theory of independent component analysis (ICA) [1]. ICA assumes that information-bearing signals, such as speech, are not Gaussian distributed, whereas sums of signals, such as background noise, are Gaussian distributed. There are two common criteria for measuring the degree of non-Gaussianity: kurtosis and negentropy. As described in this work, these criteria use moments higher than the second, and such methods are therefore referred to as higher-order statistics (HOS). Although Arai et al. showed that beamforming and BSS are similar [2], HOS have so far not been used in acoustic beamforming, which continues to rely on SOS. In this dissertation, new beamforming algorithms based on HOS are developed and evaluated. The new beamforming methods adjust their weights according to one of the following criteria: • minimum mutual information of two beamformer outputs, • maximum negentropy of the beamformer outputs, and • maximum kurtosis of the beamformer outputs. Speech recognition experiments (measured in word error rate) show that the beamforming techniques developed here also successfully suppress interference sources in reverberant environments, a clear advantage over conventional methods.
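As a concrete illustration of the statistics involved, the sketch below (our own code, not the thesis implementation; all names are hypothetical) computes the excess kurtosis that a maximum-kurtosis beamformer drives up, alongside a basic delay-and-sum front end:

```python
import numpy as np

def excess_kurtosis(y):
    """Empirical excess kurtosis of a zero-mean signal.

    ~0 for Gaussian noise and clearly positive for super-Gaussian
    signals such as speech, which is why maximizing it steers the
    beamformer toward speech and away from diffuse noise.
    """
    y = np.asarray(y)
    power = np.mean(np.abs(y) ** 2)
    return np.mean(np.abs(y) ** 4) / power**2 - 3.0

def delay_and_sum(X, delays, fs):
    """Fixed delay-and-sum beamformer via frequency-domain steering.

    X: (n_mics, n_samples); delays: per-microphone delays in seconds.
    This is only the fixed front end -- the thesis adapts further
    weights by maximizing criteria such as the kurtosis above.
    """
    n_mics, n = X.shape
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    steer = np.exp(-2j * np.pi * freqs[None, :] * np.asarray(delays, float)[:, None])
    return np.fft.irfft(np.mean(steer * np.fft.rfft(X, axis=1), axis=0), n=n)

rng = np.random.default_rng(0)
print(excess_kurtosis(rng.normal(size=100_000)))   # ~ 0 (Gaussian)
print(excess_kurtosis(rng.laplace(size=100_000)))  # ~ 3 (super-Gaussian)
```

For a Laplacian (speech-like) signal the excess kurtosis is about 3, versus 0 for Gaussian noise, which is the gap the maximum-kurtosis criterion exploits.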

    Applications of Blind Source Separation to the Magnetoencephalogram Background Activity in Alzheimer’s Disease

    In this doctoral thesis, baseline magnetoencephalogram (MEG) activity from 36 patients with Alzheimer's disease (AD) and 26 elderly control subjects was analyzed with blind source separation (BSS) techniques. The objective was to apply BSS methods to aid in the analysis and interpretation of this type of brain activity, paying special attention to AD. The term BSS denotes a set of techniques useful for decomposing multichannel recordings into the components that gave rise to them. Four different applications have been developed. The results of this doctoral thesis suggest the usefulness of BSS for aiding in the processing of baseline MEG activity and for identifying and characterizing AD. Departamento de Teoría de la Señal y Comunicaciones e Ingeniería Telemática
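The kind of decomposition applied here can be illustrated with a minimal re-implementation of one standard BSS algorithm, symmetric FastICA; this is our own illustrative sketch, not the method or code used in the thesis:

```python
import numpy as np

def fastica(X, n_iter=200, seed=0):
    """Minimal symmetric FastICA with a tanh nonlinearity.

    X: (n_channels, n_samples) mixtures. Returns estimated sources
    (up to permutation, sign, and scale), as when decomposing a
    multichannel MEG recording into independent components.
    """
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=1, keepdims=True)
    # Whiten: decorrelate the channels and normalize their variance
    d, E = np.linalg.eigh(np.cov(X))
    Z = (E @ np.diag(d ** -0.5) @ E.T) @ X
    n, T = Z.shape
    W = rng.normal(size=(n, n))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        # Fixed-point update: w <- E[z g(w'z)] - E[g'(w'z)] w
        W = (G @ Z.T) / T - np.diag((1 - G**2).mean(axis=1)) @ W
        # Symmetric decorrelation: W <- (W W')^(-1/2) W
        U, _, Vt = np.linalg.svd(W)
        W = U @ Vt
    return W @ Z

# Hypothetical demo: unmix two artificially mixed non-Gaussian sources
rng = np.random.default_rng(1)
t = np.linspace(0, 40 * np.pi, 5000)
S = np.vstack([np.sin(t), rng.laplace(size=t.size)])
A = np.array([[1.0, 0.6], [0.4, 1.0]])   # unknown mixing matrix
S_hat = fastica(A @ S)
```

Each row of `S_hat` should correlate strongly with exactly one of the original sources, which is the property BSS relies on when isolating artifactual or pathological MEG components.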

    Advancing Electromyographic Continuous Speech Recognition: Signal Preprocessing and Modeling

    Speech is the natural medium of human communication, but audible speech can be overheard by bystanders and excludes speech-disabled people. This work presents a speech recognizer based on surface electromyography, where electric potentials of the facial muscles are captured by surface electrodes, allowing speech to be processed nonacoustically. A system which was state-of-the-art at the beginning of this book is substantially improved in terms of accuracy, flexibility, and robustness

    Multimodal Data Fusion: An Overview of Methods, Challenges and Prospects

    In various disciplines, information about the same phenomenon can be acquired from different types of detectors, under different conditions, in multiple experiments or subjects, among others. We use the term "modality" for each such acquisition framework. Owing to the rich characteristics of natural phenomena, it is rare that a single modality provides complete knowledge of the phenomenon of interest. The increasing availability of several modalities reporting on the same system introduces new degrees of freedom, which raise questions beyond those related to exploiting each modality separately. As we argue, many of these questions, or "challenges", are common to multiple domains. This paper deals with two key questions: "why do we need data fusion" and "how do we perform it". The first question is motivated by numerous examples in science and technology, followed by a mathematical framework that showcases some of the benefits that data fusion provides. To address the second question, "diversity" is introduced as a key concept, and a number of data-driven solutions based on matrix and tensor decompositions are discussed, emphasizing how they account for diversity across the datasets. The aim of this paper is to provide the reader, regardless of his or her community of origin, with a taste of the vastness of the field and the prospects and opportunities that it holds.
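One simple data-driven instance of such decompositions is a coupled matrix factorization, where two modalities share one factor. The sketch below is our own illustration under hypothetical names, not the paper's formulation:

```python
import numpy as np

def coupled_mf(X1, X2, rank, n_iter=50):
    """Coupled matrix factorization: X1 ~ U @ V1.T, X2 ~ U @ V2.T.

    The factor U is shared across the two modalities (e.g. the same
    subjects observed by two instruments), while V1 and V2 remain
    modality-specific -- one simple way of exploiting "diversity".
    Plain alternating least squares; purely illustrative.
    """
    rng = np.random.default_rng(0)
    U = rng.normal(size=(X1.shape[0], rank))
    for _ in range(n_iter):
        # Modality-specific factors given the shared factor U
        V1 = np.linalg.lstsq(U, X1, rcond=None)[0].T
        V2 = np.linalg.lstsq(U, X2, rcond=None)[0].T
        # Shared factor from the stacked problem [X1 X2] ~ U @ V.T
        V = np.vstack([V1, V2])
        U = np.linalg.lstsq(V, np.hstack([X1, X2]).T, rcond=None)[0].T
    return U, V1, V2
```

Coupling the factorizations through `U` is what lets one modality compensate for what the other cannot observe, which is the core benefit of fusion over analyzing each dataset separately.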

    EMG-to-Speech: Direct Generation of Speech from Facial Electromyographic Signals

    The general objective of this work is the design, implementation, improvement, and evaluation of a system that uses surface electromyographic (EMG) signals to directly synthesize an audible speech output: EMG-to-Speech.

    Statistical causality in the EEG for the study of cognitive functions in healthy and pathological brains

    Understanding brain function requires not only information about the spatial localization of neural activity, but also about the dynamic functional links between the involved groups of neurons, which do not work in isolation but interact through ingoing and outgoing connections. The work carried out during the three years of the PhD course delivers a methodological framework for the estimation of causal brain connectivity and its validation on simulated and real datasets (EEG and pseudo-EEG) at scalp and source level. Important open issues, such as the selection of the best algorithms for source reconstruction and for time-varying estimates, were addressed. Moreover, after applying such approaches to real datasets recorded from healthy subjects and post-stroke patients, we extracted neurophysiological indices describing in a stable and reliable way the properties of the brain circuits underlying different cognitive states in humans (attention, memory). In more detail: I defined and implemented a toolbox (SEED-G toolbox) that provides a useful validation instrument for researchers working in the field of brain connectivity estimation. It may have strong implications, especially for methodological advancements, as it allows testing the ability of different estimators under increasingly less ideal conditions: a low number of available samples and trials, high inter-trial variability (very realistic situations when patients are involved in protocols), or time-varying connectivity patterns to be estimated (where the wide-sense stationarity hypothesis fails). A first simulation study demonstrated the robustness and accuracy of the partial directed coherence (PDC) with respect to inter-trial variability under a large range of conditions usually encountered in practice.
The simulations carried out on the time-varying algorithms highlighted the performance of the existing methodologies under different conditions of signal length and number of available trials. Moreover, the adaptation of the Kalman-based algorithm (GLKF) that I implemented, with the introduction of a preliminary estimation of the initial conditions for the algorithm, led to significantly better performance. Another simulation study identified a tool combining source localization approaches and brain connectivity estimation that provides accurate and reliable estimates affected as little as possible by spurious links due to head volume conduction. The developed and tested methodologies were successfully applied to three real datasets. The first was recorded from a group of healthy subjects performing an attention task and allowed describing the brain circuits, at scalp and source level, related to three important attention functions: alerting, orienting, and executive control. The second EEG dataset comes from a group of healthy subjects performing a memory task. Also in this case, the approaches under investigation allowed identifying synthetic connectivity-based descriptors able to characterize the three main memory phases (encoding, storage, and retrieval). For the last analysis, I recorded EEG data from a group of stroke patients performing the same memory task before and after one month of cognitive rehabilitation. The promising results of this preliminary study showed the possibility of tracking the changes observed at the behavioural level by means of the introduced neurophysiological indices.
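The PDC estimator evaluated in the first simulation study operates on the coefficients of a multivariate autoregressive (MVAR) model. A minimal sketch of that pipeline, assuming a standard least-squares MVAR fit (our own code and naming, not the SEED-G toolbox API), could look like this:

```python
import numpy as np

def fit_mvar(X, p):
    """Least-squares fit of x[t] = sum_r A_r x[t-r] + e, r = 1..p.

    X: (n_channels, n_samples). Returns the list [A_1, ..., A_p].
    """
    n_ch, T = X.shape
    Y = X[:, p:].T                                                 # targets x[t]
    Z = np.hstack([X[:, p - r:T - r].T for r in range(1, p + 1)])  # lagged regressors
    B = np.linalg.lstsq(Z, Y, rcond=None)[0]
    return [B[(r - 1) * n_ch:r * n_ch].T for r in range(1, p + 1)]

def pdc(A, freqs):
    """Partial directed coherence from MVAR coefficient matrices.

    pdc[f, i, j] quantifies the direct influence of channel j on
    channel i at normalized frequency f (column-normalized |Abar|).
    """
    n = A[0].shape[0]
    out = np.empty((len(freqs), n, n))
    for k, f in enumerate(freqs):
        Abar = np.eye(n, dtype=complex)
        for r, Ar in enumerate(A, start=1):
            Abar -= Ar * np.exp(-2j * np.pi * f * r)
        out[k] = np.abs(Abar) / np.sqrt((np.abs(Abar) ** 2).sum(axis=0, keepdims=True))
    return out

# Hypothetical demo: channel 0 drives channel 1, not vice versa
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.0],
                   [0.4, 0.3]])
X = np.zeros((2, 5000))
for t in range(1, 5000):
    X[:, t] = A_true @ X[:, t - 1] + rng.normal(size=2)
A_est = fit_mvar(X, p=1)[0]
P = pdc([A_est], freqs=[0.1])
```

In this toy network the estimated PDC from channel 0 to channel 1 is clearly nonzero, while the reverse direction stays near zero, which is exactly the asymmetry a causal connectivity estimator must recover.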