1,362 research outputs found

    A CNN-Based Approach to Identification of Degradations in Speech Signals

    Velum movement detection based on surface electromyography for speech interface

    Conventional speech communication systems do not perform well in the absence of an intelligible acoustic signal. Silent Speech Interfaces enable speech communication with speech-handicapped users and in noisy environments. However, since no acoustic signal is available, information on nasality may be absent, even though nasality is an important and relevant characteristic of several languages, particularly European Portuguese. In this paper we propose a non-invasive method, surface Electromyography (EMG) electrodes positioned in the face and neck regions, to explore the existence of useful information about velum movement. The applied procedure takes advantage of Real-Time Magnetic Resonance Imaging (RT-MRI) data, collected from the same speakers, to interpret and validate the EMG data. By ensuring compatible scenario conditions and proper alignment between the EMG and RT-MRI data, we are able to estimate when the velum moves and the probable type of movement under a nasality occurrence. Overall, the results of this experiment revealed interesting and distinct characteristics in the EMG signal when a nasal vowel is uttered, and showed that it is possible to detect velum movement, particularly with sensors positioned below the ear, between the mastoid process and the mandible in the upper neck region.

    Human summating potential using continuous loop averaging deconvolution: Response amplitudes vary with tone burst repetition rate and duration

    Electrocochleography (ECochG) to high repetition rate tone bursts may have advantages over ECochG to clicks with standard slow rates. Tone burst stimuli presented at a high repetition rate may enhance summating potential (SP) measurements by reducing neural contributions resulting from neural adaptation to high stimulus repetition rates. To allow for the analysis of the complex ECochG responses to high rates, we deconvolved responses using the Continuous Loop Averaging Deconvolution (CLAD) technique. We examined the effect of high stimulus repetition rate and stimulus duration on SP amplitude measurements made with extratympanic ECochG to tone bursts in 20 adult females with normal hearing. We used 500 and 2,000 Hz tone bursts of various stimulus durations (12, 6, 3 ms) and repetition rates (five rates ranging from 7.1 to 234.38/s). A within-subject repeated measures (rate x duration) analysis of variance was conducted. We found that, for both 500 and 2,000 Hz stimuli, the mean deconvolved SP amplitudes were larger at faster repetition rates (58.59 and 97.66/s) compared to slower repetition rates (7.1 and 19.53/s), and larger at shorter stimulus durations compared to longer stimulus durations. Our concluding hypothesis is that large SP amplitudes to short duration stimuli may originate primarily from neural excitation, and large SP amplitudes to long duration, fast repetition rate stimuli may originate from hair cell responses. While the hair cell or neural origins of the SP to various stimulus parameters remain to be validated, our results nevertheless provide normative data as a step toward applying the CLAD technique to understanding diseased ears.

    Evaluation of room acoustic qualities and defects by use of auralization

    Towards a Multimodal Silent Speech Interface for European Portuguese

    Automatic Speech Recognition (ASR) in the presence of environmental noise is still a hard problem to tackle in speech science (Ng et al., 2000). Another problem well described in the literature concerns elderly speech production. Studies (Helfrich, 1979) have shown evidence of a slower speech rate, more breaks, more speech errors and a reduced volume of speech when comparing, on an acoustic level, elderly speech with that of teenagers or adults. This makes elderly speech hard to recognize using currently available stochastic-based ASR technology. To tackle these two problems in the context of ASR for Human-Computer Interaction, a novel Silent Speech Interface (SSI) in European Portuguese (EP) is envisioned.