601 research outputs found

    Heterogeneous recognition of bioacoustic signals for human-machine interfaces

    No full text
    Human-machine interfaces (HMI) provide a communication pathway between man and machine. Not only do they augment existing pathways, they can substitute or even bypass these pathways where functional motor loss prevents the use of standard interfaces. This is especially important for individuals who rely on assistive technology in their everyday life. Utilising bioacoustic activity can lead to an assistive HMI concept which is unobtrusive, minimally disruptive and cosmetically appealing to the user. However, due to the complexity of the signals, bioacoustic activity remains relatively underexplored in the HMI field. This thesis investigates extracting and decoding volition from bioacoustic activity with the aim of generating real-time commands. The developed framework is a systemisation of various processing blocks enabling the mapping of continuous signals into M discrete classes. Class-independent extraction efficiently detects and segments the continuous signals, while class-specific extraction exemplifies each pattern set using a novel template creation process that is stable to permutations of the data set. These templates are utilised by a generalised single-channel discrimination model, whereby each signal is template-aligned prior to classification. The real-time decoding subsystem uses a multichannel heterogeneous ensemble architecture which fuses the output from a diverse set of these individual discrimination models. This enhances classification performance by elevating both sensitivity and specificity, with the increased specificity due to a natural rejection capacity based on a non-parametric majority vote. Such a strategy is useful when the signals have diverse characteristics, when false positives are prevalent and carry strong consequences, and when limited training data are available. The framework has been developed with generality in mind and is widely applicable to a broad spectrum of biosignals.
The processing system has been demonstrated on real-time decoding of tongue-movement ear pressure signals using both single and dual channel setups. This has included in-depth evaluation of these methods in both offline and online scenarios. During online evaluation, a stimulus-based test methodology was devised, while representative interference was used to contaminate the decoding process in a relevant and realistic fashion. The results of this research provide a strong case for the utility of such techniques in real-world applications of human-machine communication using impulsive bioacoustic signals and biosignals in general.
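The majority-vote fusion with natural rejection described above can be sketched as follows (a minimal illustration only; the function name, label set and agreement threshold are assumptions, not the thesis implementation):

```python
from collections import Counter

def ensemble_decision(votes, min_agreement):
    """Fuse per-channel classifier outputs by non-parametric majority vote.

    votes: class labels predicted by the individual discrimination
           models (one per channel/model).
    min_agreement: minimum number of agreeing votes required; anything
           below this is rejected (returns None), which is what raises
           specificity against prevalent, costly false positives.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agreement else None

# Three of four models agree -> the command is accepted
assert ensemble_decision(["left", "left", "left", "up"], 3) == "left"
# No label reaches the threshold -> rejected as a likely false positive
assert ensemble_decision(["left", "up", "down", "left"], 3) is None
```

The rejection case is the key design choice: with impulsive biosignals, emitting no command is usually safer than emitting a wrong one.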

    EEG Signal Processing in Motor Imagery Brain Computer Interfaces with Improved Covariance Estimators

    Get PDF
    The research and development in the field of Brain Computer Interfaces (BCI) has been growing during the last years, motivated by several factors. As knowledge about how the human brain works (of which we still know very little) grows, new advances in BCI systems emerge that, in turn, serve as motivation for more research into this organ. In addition, BCI systems open a door for anyone to interact with their environment regardless of any physical disability, simply by using their thoughts. Recently, the technology industry has begun to show interest in these systems, motivated both by the advances in what we know of the brain and how it works, and by the constant use we make of technology nowadays, whether through our smartphones, tablets or computers, among many other devices. This motivates companies like Facebook to invest in the development of BCI systems so that people (with or without disabilities) can communicate with their devices using only their brain.
The work developed in this thesis focuses on BCI systems based on motor imagery. This means that the user thinks of certain motor movements that are interpreted by a computer as commands. The brain signals to be translated into commands are obtained by an EEG device that is placed on the scalp and measures the electromagnetic activity produced by the brain. Working with these signals is complex since they are non-stationary and, in addition, usually heavily contaminated by noise or artifacts. We have approached this subject from the point of view of statistical signal processing and through machine learning algorithms. For this, the BCI system has been split into three blocks: preprocessing, feature extraction and classification. After reviewing the state of the art of these blocks, a set of publications that we have produced in recent years is summarized and attached. In these publications can be found the different contributions that, from our point of view, improve each of the blocks mentioned above. In brief: for the preprocessing block we propose a method that normalizes the sources of the EEG signals; by equalizing the effective sources we improve the estimation of the covariance matrices. For the feature extraction block, we have extended the CSP algorithm to unsupervised cases. Finally, in the classification block we have also achieved class separation in an unsupervised way, and we have observed an improvement when the LDA algorithm is regularized by a method specific to Gaussian distributions.
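The covariance-estimation theme above can be illustrated with a standard shrinkage estimator feeding a binary LDA classifier (a generic sketch under assumed data shapes and a hand-picked shrinkage coefficient, not the specific estimators proposed in the thesis):

```python
import numpy as np

def shrinkage_cov(X, alpha):
    """Shrink the sample covariance of X (trials x channels) toward a
    scaled identity. With few EEG trials the sample covariance is
    ill-conditioned; shrinkage trades a little bias for much lower
    variance. alpha is fixed here purely for illustration."""
    S = np.cov(X, rowvar=False)
    target = np.trace(S) / S.shape[0] * np.eye(S.shape[0])
    return (1 - alpha) * S + alpha * target

def lda_weights(X0, X1, alpha=0.1):
    """Binary LDA direction w = C^-1 (mu1 - mu0) using the shrunk
    pooled covariance (a textbook construction)."""
    C = shrinkage_cov(np.vstack([X0 - X0.mean(0), X1 - X1.mean(0)]), alpha)
    return np.linalg.solve(C, X1.mean(0) - X0.mean(0))

rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, (20, 8))   # class 0 trials (e.g. left hand)
X1 = rng.normal(0.5, 1.0, (20, 8))   # class 1 trials (e.g. right hand)
w = lda_weights(X0, X1)
# Projected class means separate along the discriminant direction
assert (X1 @ w).mean() > (X0 @ w).mean()
```

A better-conditioned covariance estimate improves exactly this kind of pipeline, since both CSP and LDA are built on top of the estimated matrices.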

    Automated online and real-time monitoring of environmental risk

    Get PDF
    Environmental degradation due to human activities has created the need to monitor industrial wastewaters, rivers and oceans for fast, on-site, real-time pollutant detection. The present work intends to optimize a heavy metal detection system based on different Ion Selective Electrodes (ISE) implanted in Sequential Injection Analysis (SIA) equipment, combining self-preparation of chemical standards, calibration and sample analysis in a single piece of equipment. The goal is to calibrate, optimize and automate different ISEs implanted in a single unit, usually described as an Electronic Tongue (ET) in the literature because of its analogy with biological tongue functionality. In this work the setup of the ISEs within an automated system, the calibration of the electrodes and the determination of metals were achieved successfully. The quality of the signal was also improved by identifying the noise sources. The study of an online monitoring system is investigated: the Secure File Transfer Protocol (SFTP) is considered a suitable option for online monitoring and real-time analysis with the Risk Assessment Data Base (RAdb), developed by the Norwegian Institute for Water Research (NIVA).
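The ISE calibration step can be sketched with an idealized Nernstian fit, i.e. a least-squares line of measured potential against the logarithm of concentration (hypothetical calibration values; the function names and the ideal 59.2 mV/decade slope for a monovalent ion at 25 °C are illustrative assumptions, not values from this work):

```python
import math

def calibrate_ise(concentrations, potentials_mV):
    """Least-squares fit of the Nernstian response E = E0 + S*log10(c)
    from a series of calibration standards. Returns the intercept E0
    and slope S (mV per decade of concentration)."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mx, my = sum(xs) / n, sum(potentials_mV) / n
    S = sum((x - mx) * (y - my) for x, y in zip(xs, potentials_mV)) \
        / sum((x - mx) ** 2 for x in xs)
    E0 = my - S * mx
    return E0, S

def concentration_from_potential(E, E0, S):
    """Invert the calibration line to estimate an unknown sample."""
    return 10 ** ((E - E0) / S)

# Ideal monovalent-cation calibration data (59.2 mV/decade slope)
conc = [1e-5, 1e-4, 1e-3, 1e-2]
emf = [100.0 + 59.2 * math.log10(c) for c in conc]
E0, S = calibrate_ise(conc, emf)
c_est = concentration_from_potential(100.0 + 59.2 * math.log10(5e-4), E0, S)
assert abs(c_est - 5e-4) / 5e-4 < 1e-6
```

In an automated SIA setup, the same fit would simply be re-run each time the equipment self-prepares fresh standards.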

    Development of EEG-based technologies for the characterization and treatment of neurological diseases affecting the motor function

    Get PDF
    This thesis presents a set of studies applying signal processing and data mining techniques in real-time systems to register, characterize and condition the movement-related cortical activity of healthy subjects and of patients with neurological disorders affecting the motor function. Patients with two of the most widespread neurological conditions impairing motor function are considered here: patients with essential tremor and patients who have suffered a cerebrovascular accident. The different chapters of this thesis present results regarding the normal cortical activity associated with the planning and execution of upper-limb motor actions, and the pathological activity related to the patients' motor dysfunction (measurable with muscle electrodes or movement sensors). The initial chapters present i) a revision of the basic concepts regarding the role of the cerebral cortex in motor control and the way in which electroencephalographic activity allows its analysis and conditioning, ii) a study on the cortico-muscular interaction at the tremor frequency in patients with essential tremor under the effects of a tremor-reducing drug, and iii) a study based on evolutionary algorithms that aims to identify cortical patterns related to the planning of a number of motor tasks performed with a single arm. In the second half of the thesis, two brain-computer interface systems to be used in rehabilitation scenarios with essential tremor patients and with stroke patients are proposed. In the first system, the electroencephalographic activity is used to anticipate voluntary movement actions, and this information is integrated in a multimodal platform estimating and suppressing the pathological tremors.
In the second case, a conditioning paradigm for stroke patients based on the identification of the motor intention with temporal precision is presented and tested with a cohort of four patients over a month, during which the patients undergo eight intervention sessions. The presented thesis has yielded advances from both the technological and the scientific points of view in all the studies proposed. The main contributions from the technological point of view are:
- The design of an integrated upper-limb platform working in real-time. The platform was designed to acquire information from different types of noninvasive sensors (EEG, EMG and gyroscopic sensors) characterizing the planning and execution of voluntary movements. The platform was also capable of processing the acquired data online and generating electrical feedback.
- The development of signal processing and classification techniques adapted to the kinds of signals recorded in the two groups of patients considered in this thesis (patients with essential tremor and patients with a stroke) and to the requirements of online processing and real-time single-trial operation desired for BCI applications. In particular, an original methodology to detect onsets of voluntary movements using slow cortical potentials and cortical rhythms has been presented.
- The design and real-time validation of asynchronous BCI systems using motor-planning EEG segments to anticipate or detect when patients begin a voluntary movement with the upper limb.
- The proof of concept of the advantages of an EEG system integrated in a multimodal human-robot interface architecture, constituting the first multimodal interface using the combined acquisition of EEG, EMG and gyroscopic data, which allows the concurrent characterization of the different parts of the body involved in the execution of a movement.
The main scientific contributions of this thesis are:
- The study of the EEG-based anticipation of voluntary movements presented in Chapter 5 was the first demonstration (to the author's knowledge) of the capacity of the EEG signal to provide reliable movement predictions based on single-trial classification of online data from healthy subjects and ET patients. This study also provides, for the first time, the results of a BCI system tested in ET patients, and it represents an original approach to BCI applications for this group of patients.
- The first neurophysiological study using EEG and EMG data to analyze the effects of a drug on the cortical activity and tremor of patients with ET has been presented. In addition, the results obtained have shown for the first time that a significant correlation exists between the dynamics of specific cortical oscillations and pathological tremor manifestation as a consequence of the drug effects.
- An experiment has been proposed for the first time to inspect whether the EEG signal carries enough information to classify up to seven different tasks performed with a single limb. Both the methodology applied and the validation procedure are also innovative in this sort of study.
- The relevance of combining different cortical sources of information (such as BP and ERD) to estimate the initiation of voluntary movements with the upper limb has been demonstrated for the first time. In this line, special relevance may be given to the positive results achieved with stroke patients, improving on the results presented by similar previous EEG-based studies by other research groups. An upper-limb intervention protocol for stroke patients using BP and ERD patterns to provide proprioceptive feedback tightly associated with the patients' expectations of movement has also been proposed for the first time. The effects of the proposed intervention have been studied with a small group of patients.
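The asynchronous onset-detection idea, flagging a movement only when the cortical evidence is sustained rather than momentary, can be sketched generically (the score series, threshold and run length are assumed for illustration; the actual BP/ERD feature computation from EEG is outside this sketch and not the thesis's method):

```python
def detect_onset(score, threshold, k):
    """Asynchronous onset detector sketch: flag a movement onset when a
    per-sample classifier score (e.g. combined slow-potential and
    rhythm evidence) stays above threshold for k consecutive samples,
    which suppresses isolated spurious crossings."""
    run = 0
    for i, s in enumerate(score):
        run = run + 1 if s > threshold else 0
        if run == k:
            return i - k + 1  # index where the sustained crossing began
    return None

scores = [0.1, 0.2, 0.9, 0.1, 0.8, 0.9, 0.95, 0.85, 0.2]
# The isolated spike at index 2 is ignored; the sustained run starting
# at index 4 is reported as the onset
assert detect_onset(scores, 0.7, 3) == 4
```

Requiring a sustained crossing is one simple way to trade detection latency against the false-positive rate, the central tension in any asynchronous BCI.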

    A Silent-Speech Interface using Electro-Optical Stomatography

    Get PDF
    Speech technology is a major and growing industry that enriches the lives of technologically-minded people in a number of ways. Many potential users are, however, excluded: namely, all speakers who cannot easily, or at all, produce speech. Silent-Speech Interfaces offer a way to communicate with a machine through a convenient speech-driven interface without the need for acoustic speech. They can also potentially provide a full replacement voice by synthesizing the intended utterances that the user only silently articulates. To that end, the speech movements need to be captured and mapped to either text or acoustic speech.
This dissertation proposes a new Silent-Speech Interface based on a newly developed measurement technology called Electro-Optical Stomatography and a novel parametric vocal tract model enabling real-time speech synthesis from the measured data. The hardware was used to conduct command word recognition studies reaching and surpassing state-of-the-art intra- and inter-individual performance. Furthermore, a study on using the hardware to control the vocal tract model in a direct articulation-to-speech synthesis loop was completed. While the intelligibility of the synthesized vowels was rated very high, the intelligibility of consonants and connected speech was quite poor. Promising ways to improve the system are discussed in the outlook. The thesis is organized as follows: 1. Introduction; 2. Fundamentals of phonetics (vowel and consonantal sounds, acoustic properties of speech sounds, coarticulation, phonotactics, and implications for the design of a Silent-Speech Interface); 3. Articulatory data acquisition techniques in Silent-Speech Interfaces (video recordings, ultrasonography, electromyography, permanent-magnetic and electromagnetic articulography, radio waves, palatography); 4. Electro-Optical Stomatography (contact sensors, optical distance sensors, lip sensor, sensor unit, control unit, software); 5. Articulation-to-Text (command word recognition pilot and small-scale studies); 6. Articulation-to-Speech (articulatory synthesis, the six-point vocal tract model and its objective and perceptual evaluation, direct synthesis using EOS, pitch and voicing); 7. Summary and outlook; with appendices on the International Phonetic Alphabet, mathematical proofs and derivations, schematics, layouts and bills of materials, sensor unit assembly, firmware flow and data protocol, the palate file format, supplemental vocal tract model material, and optimal hyperparameters.

    Augmented Reality

    Get PDF
    Augmented Reality (AR) is a natural development from virtual reality (VR), which was developed several decades earlier, and it complements VR in many ways. Because the user can see real and virtual objects simultaneously, AR is far more intuitive, though it is not free from human-factor and other restrictions. AR also demands less time and effort in applications, because it is not necessary to construct the entire virtual scene and environment. In this book, several new and emerging application areas of AR are presented, divided into three sections. The first section contains applications in outdoor and mobile AR, such as construction, restoration, security and surveillance. The second section deals with AR in the medical and biological domains and the human body. The third and final section contains a number of new and useful applications in daily living and learning.

    Navigation system based in motion tracking sensor for percutaneous renal access

    Get PDF
    Doctoral Thesis in Biomedical Engineering. Minimally-invasive kidney interventions are performed daily to diagnose and treat several renal diseases. Percutaneous renal access (PRA) is an essential but challenging stage for most of these procedures, since its outcome is directly linked to the physician's ability to precisely visualize and reach the anatomical target. Nowadays, PRA is always guided with medical imaging assistance, most frequently using X-ray based imaging (e.g. fluoroscopy). Thus, radiation in the surgical theater represents a major risk to the medical team, and its exclusion from PRA would directly diminish the dose exposure of both patients and physicians. To solve these problems, this thesis aims to develop a new hardware/software framework to intuitively and safely guide the surgeon during PRA planning and puncturing. In terms of surgical planning, a set of methodologies was developed to increase the certainty of reaching a specific target inside the kidney. The most relevant abdominal structures for PRA were automatically clustered into different 3D volumes. For that, primitive volumes were merged as a local optimization problem using the minimum description length principle and image statistical properties. A multi-volume ray casting method was then used to highlight each segmented volume. Results show that it is possible to detect all abdominal structures surrounding the kidney and to correctly estimate a virtual trajectory. Concerning the percutaneous puncturing stage, electromagnetic and optical solutions were developed and tested in multiple in vitro, in vivo and ex vivo trials. The optical tracking solution aids in establishing the desired puncture site and choosing the best virtual puncture trajectory. However, this system requires a line of sight to different optical markers placed at the needle base, limiting the accuracy when tracking inside the human body.
Results show that the needle tip can deflect from its initial straight-line trajectory with an error greater than 3 mm, and a complex registration procedure and initial setup are required. A real-time electromagnetic tracking solution was therefore developed. Here, a catheter was inserted trans-urethrally towards the renal target. This catheter carries a position and orientation electromagnetic sensor at its tip that functions as a real-time target locator. A needle integrating a similar sensor is then used. From the data provided by both sensors, a virtual puncture trajectory is computed and displayed in 3D visualization software. In vivo tests showed median renal and ureteral puncture times of 19 and 51 seconds, respectively (ranges 14 to 45 and 45 to 67 seconds). These results represent a puncture time improvement of between 75% and 85% compared to state-of-the-art methods. 3D sound and vibrotactile feedback were also developed to provide additional information about needle orientation. With these kinds of feedback, the surgeon tends to follow the virtual puncture trajectory with fewer deviations from the ideal path and is able to anticipate movements even without looking at a monitor. In the best results, 3D sound sources were correctly identified 79.2 ± 8.1% of the time with an average angulation error of 10.4°, and vibration sources were correctly identified 91.1 ± 3.6% of the time with an average angulation error of 8.0°. In addition to the electromagnetic tracking (EMT) framework, three circular ultrasound transducers with a needle working channel were built, exploring different fabrication setups in terms of piezoelectric materials, transducer construction, single versus multi-array configurations, and backing and matching material design. 
The A-scan signals retrieved from each transducer were filtered and processed to automatically detect reflected echoes and to alert the surgeon when undesirable anatomical structures lie in the puncture path. The transducers were mapped in a water tank and tested in a study involving 45 phantoms. Results showed that the beam's cross-sectional area oscillates around the ceramic's radius, and echo signals could be automatically detected in phantoms longer than 80 mm. It is therefore expected that introducing the proposed system into the PRA procedure will guide the surgeon along the optimal path towards the precise kidney target, increasing the surgeon's confidence and reducing complications (e.g. organ perforation) during PRA. Moreover, the developed framework has the potential to make PRA radiation-free for both patient and surgeon and to broaden the use of PRA to less specialized surgeons. 
This work was supported by the Portuguese Science and Technology Foundation (FCT) through PhD grant SFRH/BD/74276/2010, funded by FCT/MEC (PIDDAC), and by the Fundo Europeu de Desenvolvimento Regional (FEDER) through Programa COMPETE - Programa Operacional Factores de Competitividade (POFC) of QREN.
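At its core, the electromagnetic guidance described above reduces to comparing the needle's current axis with the straight line from the needle tip to the catheter-borne target sensor. A minimal sketch of that geometry, under simplifying assumptions (the function name and interface are hypothetical, not the thesis's actual software):

```python
import numpy as np

def puncture_guidance(needle_tip, needle_axis, target):
    """Return the distance to the target and the angular deviation
    (in degrees) between the needle's current axis and the ideal
    straight-line trajectory from the needle tip to the target."""
    to_target = np.asarray(target, float) - np.asarray(needle_tip, float)
    distance = np.linalg.norm(to_target)
    ideal_dir = to_target / distance
    axis = np.asarray(needle_axis, float)
    axis = axis / np.linalg.norm(axis)
    cos_a = np.clip(np.dot(axis, ideal_dir), -1.0, 1.0)
    deviation_deg = np.degrees(np.arccos(cos_a))
    return distance, deviation_deg

# Needle 60 mm from the target, axis slightly tilted off the ideal line:
d, dev = puncture_guidance([0, 0, 0], [0, 0.1, 1.0], [0, 0, 60])
```

In a real-time loop, `dev` is the quantity a surgeon would drive towards zero, and it is also the kind of signal the 3D sound and vibrotactile feedback described above could encode.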

    Movement-Related Desynchronization in EEG-based Brain-Computer Interface applications for stroke motor rehabilitation

    Get PDF
    Neurological degenerative diseases such as stroke, Alzheimer's disease, Amyotrophic Lateral Sclerosis (ALS) and Parkinson's disease are steadily increasing their incidence in world health statistics as the mean age of the global population rises. This leads to a general need for effective, at-home and low-cost rehabilitation and daily health-care tools. These should consist either of technological devices operating remotely, as tele-medicine is quickly spreading around the world, or of advanced computer-based and robotic systems for intense and repetitive training. This is the challenge in which Information and Communications Technology (ICT) is asked to play a major role in order to bring medicine further advances. Indeed, coping with these issues is impossible without strong, lively cooperation among multi-disciplinary teams of clinicians, physicians, biologists, neuro-psychologists and engineers, and without a resolute push towards widespread interoperability between institutes, hospitals and universities all over the world, as recently highlighted during the main international conferences on ICT in healthcare. The establishment of well-defined standards for gathering and sharing data will be a key element in enhancing the efficacy of these collaborations. Among neurological pathologies, stroke is one of the most common, being the second or third cause of mortality in the world; moreover, more than sixty percent of survivors are left with severe cognitive and motor impairments that prevent them from living normal lives and require twenty-four-hour daily care. As a consequence, stroke survivors experience the frustrating condition of being completely dependent on other people even for simple daily actions such as reaching and grasping an object or holding a glass of water to drink. 
States, for their part, must bear additional costs to provide stroke patients and their families with appropriate care and support. For this reason, more and more funding has recently been made available through grants, European and international projects, and programs that exchange expertise among countries, with the aim of accelerating and improving the recovery of chronic stroke patients. Global research on this topic proceeds along several parallel lines. Regarding the basic knowledge of brain processes, neurophysiologists, biologists and engineers are particularly interested in an in-depth understanding of the so-called neuroplastic changes that the brain operates daily in order to adapt individuals to life changes and experiences and to realize their potential more fully. Neuroplasticity is indeed the cornerstone of most of the training adopted by both standard and more innovative rehabilitation programs for post-stroke recovery. Specifically, motor rehabilitation usually includes long-term, repetitive and intense goal-directed exercises that promote neuroplastic mechanisms such as neural sprouting, synaptogenesis and dendritic branching. These processes are strictly related to motor improvements, and their study could one day serve as a prognostic measure of recovery. Another aspect of this field of neuroscience research is the number of applications it makes feasible. One of the most exciting is to connect an injured brain to a computer or a robotic device in a Brain-Computer or Brain-Machine Interface (BCI or BMI) scheme, aiming to bypass the patient's impairments and let him/her move autonomously again or train his/her motor abilities more effectively. 
This kind of research can already count on a body of literature providing several proofs of concept that such heterogeneous systems of humans and robots can serve this purpose. A particular BCI application for restoring, or at least enhancing, the reaching abilities of chronic stroke survivors was implemented and is still being improved at the I.R.C.C.S. San Camillo Hospital Foundation, an institute for rehabilitation from neurological diseases located in Lido of Venice, with partial technical support from the Department of Information Engineering of Padua under an agreement signed in 2009. This BCI platform allows patients to train and improve their reaching movements by means of a robotic arm that delivers a force helping them complete the training exercise, i.e. hit a predetermined target. This force feedback is, however, subject to a strict condition: during the movement, the person has to produce the expected pattern of cerebral activity. Whenever this is accomplished, a force is delivered proportionally to the magnitude of that activity; otherwise the patient has to operate without any help. In this way, the platform implements so-called operant learning, one of the most effective conditioning techniques for making a subject learn or re-learn a task. If, on one hand, the primary and explicit task is to improve a movement, on the other the secondary but most important task is to recruit the still-healthy perilesional part of the brain to take over control of the movement. It is indeed a popular and widely accepted opinion within the neuroscience community that a healthy region of the sensorimotor area near the damaged one, which was previously in charge of performing the (reaching) movement, can optimally accomplish the impaired motor function, substituting for the original control area. 
Technically speaking, the crucial feature that ensures the effectiveness of the whole system is the precise, real-time identification and quantification of the cerebral pattern associated with the movement, known worldwide as movement-related desynchronization (MRD). Starting from its original definition and passing through the techniques most commonly used for its recognition, this thesis presents a series of criticisms of the current signal processing method for detecting MRD, together with a complete analysis of the features that best represent the movement condition and that can be extracted more easily during on-line operation. The brain, it is well known, learns by trial and error, and it needs slightly delayed (within fractions of a second) feedback on its performance to learn a task optimally. This BCI application was created to provide that feedback; however, this is only feasible if a computationally light and contingent signal processing technique is available. This thesis addresses the lack of a well-planned real-time signal analysis in the current experimental protocol.
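MRD is commonly quantified as the relative drop in mu-band (roughly 8-13 Hz) power during movement compared with a resting baseline. A minimal sketch under simplifying assumptions (single channel, plain FFT periodogram, synthetic data; this is not the thesis's actual pipeline, and the sign convention is flipped relative to the classic ERD% formula so that positive values indicate desynchronization):

```python
import numpy as np

def band_power(x, fs, lo, hi):
    """Mean power of x within [lo, hi] Hz via a simple FFT periodogram."""
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    band = (freqs >= lo) & (freqs <= hi)
    return psd[band].mean()

def erd_percent(baseline, trial, fs, lo=8.0, hi=13.0):
    """Relative mu-band power drop: positive => desynchronization."""
    p_ref = band_power(baseline, fs, lo, hi)
    p_act = band_power(trial, fs, lo, hi)
    return 100.0 * (p_ref - p_act) / p_ref

# Synthetic example: a strong 10 Hz mu rhythm at rest, attenuated during movement.
fs = 250
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
rest = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(fs)
move = 0.3 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(fs)
erd = erd_percent(rest, move, fs)  # large positive value: mu power dropped
```

A real-time detector would apply this over short sliding windows, which is exactly where the trade-off between computational cost and feedback latency discussed above arises.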

    Speech analysis for Ambient Assisted Living: technical and user design of a vocal order system

    No full text
    The evolution of ICT has led to the emergence of the smart home: a home equipped with data-processing technology which anticipates the needs of its inhabitants while trying to maintain their comfort and safety by acting on the house and implementing connections with the outside world. Smart homes equipped with ambient intelligence technology therefore constitute a promising direction for enabling the growing number of elderly people to continue living in their own homes as long as possible. However, the technological solutions requested by this part of the population have to suit their specific needs and capabilities. These smart homes tend to be equipped with devices whose interfaces are increasingly complex and difficult for the user to control. The people most likely to benefit from these new technologies are those losing autonomy, such as disabled people or elderly people with cognitive deficiencies (e.g. Alzheimer's disease). Moreover, these people are the least capable of using complex interfaces, due to their handicap or their limited understanding of ICT. Thus, it becomes essential to facilitate daily life and access to the whole home automation system through the smart home. The usual tactile interfaces should be supplemented by accessible ones, in particular a system reactive to the voice; such interfaces are also useful when the person cannot move easily. Vocal orders will allow the following functionality: to ensure assistance through a traditional or vocal order; to set up indirect order regulation for better energy management; to reinforce the link with relatives through interfaces dedicated and adapted to the person losing autonomy; and to ensure more safety by detecting distress situations and break-ins. 
This chapter describes the different steps needed for the conception of an ambient audio system. The first step concerns acceptability and possible objections from end users: we report a user evaluation assessing acceptance of, and apprehension about, this new technology. The experiment aimed at testing three important aspects of speech interaction: voice command, communication with the outside world, and the home automation system interrupting a person's activity. It was conducted in a smart home with a voice command simulated using a Wizard of Oz technique, and it yielded information of great interest. The second step is a general presentation of audio sensing technology for ambient assisted living: different aspects of sound and speech processing are developed, and the applications and challenges are presented. The third step concerns speech recognition in the home environment. Automatic Speech Recognition (ASR) systems have reached good performance with close-talking microphones (e.g. head-sets), but performance decreases significantly as soon as the microphone is moved away from the speaker's mouth (e.g. when the microphone is set in the ceiling). This deterioration is due to a broad variety of effects, including reverberation and undetermined background noise from TV, radio and other devices. This part presents a system for vocal order recognition in a distant-speech context, evaluated through experiments in a dedicated flat. The chapter concludes with a discussion of the interest of the speech modality for Ambient Assisted Living.
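Before any distant-speech recognition can run, vocal orders have to be segmented out of the continuous audio stream; a crude first stage for this is short-time energy voice activity detection. A minimal sketch on a synthetic signal (the function, frame length and threshold are illustrative assumptions, not the chapter's actual system):

```python
import numpy as np

def frame_energy_vad(signal, fs, frame_ms=20, threshold_db=-30.0):
    """Flag speech-active frames by short-time log energy relative to
    the loudest frame (a crude first stage before passing audio to ASR)."""
    n = int(fs * frame_ms / 1000)
    frames = signal[: len(signal) // n * n].reshape(-1, n)
    energy = (frames ** 2).mean(axis=1)
    db = 10 * np.log10(energy / energy.max() + 1e-12)
    return db > threshold_db

# 1 s of audio: a loud tone ("speech") between 0.4 s and 0.6 s over faint hum.
fs = 16000
t = np.arange(fs) / fs
sig = np.where((t > 0.4) & (t < 0.6),
               np.sin(2 * np.pi * 440 * t),
               0.001 * np.sin(2 * np.pi * 50 * t))
active = frame_energy_vad(sig, fs)  # True only around the 0.4-0.6 s region
```

Distant microphones shrink the energy gap between speech and background (TV, radio, appliances), which is one concrete way the performance degradation described above shows up: a fixed threshold that works for a head-set fails in the ceiling-microphone setting.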

    Computational Intelligence in Electromyography Analysis

    Get PDF
    Electromyography (EMG) is a technique for evaluating and recording the electrical activity produced by skeletal muscles. EMG may be used clinically for the diagnosis of neuromuscular problems and for assessing biomechanical and motor control deficits and other functional disorders. Furthermore, it can be used as a control signal for interfacing with orthotic and/or prosthetic devices or other rehabilitation aids. This book presents an updated overview of signal processing applications and recent developments in EMG from a number of diverse aspects and various applications in clinical and experimental research. It provides readers with a detailed introduction to EMG signal processing techniques and applications, while presenting several new results and explanations of existing algorithms. The book is organized into 18 chapters covering current theoretical and practical approaches to EMG research.
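When EMG is used as a control signal for prosthetic or orthotic interfaces, a classic pipeline extracts simple time-domain features from short sliding windows before classification. A minimal sketch of three such features (the feature set is a common textbook choice, not taken from this book):

```python
import numpy as np

def emg_features(window):
    """Common time-domain EMG features for myoelectric control:
    mean absolute value (MAV), root mean square (RMS) and
    zero-crossing count (ZC), computed on one analysis window."""
    w = np.asarray(window, float)
    mav = np.abs(w).mean()                                   # average rectified amplitude
    rms = np.sqrt((w ** 2).mean())                           # signal power proxy
    zc = int(np.sum(np.sign(w[:-1]) != np.sign(w[1:])))      # crude frequency-content cue
    return mav, rms, zc

# Toy window: a perfectly alternating signal has ZC = samples - 1.
mav, rms, zc = emg_features([0.5, -0.5, 0.5, -0.5])
```

Feature vectors like `(mav, rms, zc)`, computed per channel and per window, are what downstream classifiers typically consume in myoelectric control schemes.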