3,129 research outputs found

    Engaging the articulators enhances perception of concordant visible speech movements

    Full text link
    PURPOSE This study aimed to test whether (and how) somatosensory feedback signals from the vocal tract affect concurrent unimodal visual speech perception. METHOD Participants discriminated pairs of silent visual utterances of vowels under 3 experimental conditions: (a) normal (baseline) and while holding either (b) a bite block or (c) a lip tube in their mouths. To test the specificity of somatosensory-visual interactions during perception, we assessed discrimination of vowel contrasts optically distinguished based on their mandibular (English /ɛ/-/æ/) or labial (English /u/-French /u/) postures. In addition, we assessed perception of each contrast using dynamically articulating videos and static (single-frame) images of each gesture (at vowel midpoint). RESULTS Engaging the jaw selectively facilitated perception of the dynamic gestures optically distinct in terms of jaw height, whereas engaging the lips selectively facilitated perception of the dynamic gestures optically distinct in terms of their degree of lip compression and protrusion. Thus, participants perceived visible speech movements in relation to the configuration and shape of their own vocal tract (and possibly their ability to produce covert vowel production-like movements). In contrast, engaging the articulators had no effect when the speaking faces did not move, suggesting that the somatosensory inputs affected perception of time-varying kinematic information rather than changes in target (movement end point) mouth shapes. CONCLUSIONS These findings suggest that orofacial somatosensory inputs associated with speech production prime premotor and somatosensory brain regions involved in the sensorimotor control of speech, thereby facilitating perception of concordant visible speech movements. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.9911846R01 DC002852 - NIDCD NIH HHSAccepted manuscrip

    Spanish Expressive Voices: corpus for emotion research in Spanish

    Get PDF
    A new emotional multimedia database has been recorded and aligned. The database comprises speech and video recordings of one actor and one actress simulating a neutral state and the Big Six emotions: happiness, sadness, anger, surprise, fear and disgust. Due to a careful design and its size (more than 100 minutes per emotion), the recorded database allows comprehensive studies on emotional speech synthesis, prosodic modelling, speech conversion, far-field speech recognition and speech and video-based emotion identification. The database has been automatically labelled for prosodic purposes (5% was manually revised). The whole database has been validated thorough objective and perceptual tests, achieving a validation score as high as 89%

    Aerospace Medicine and Biology: A continuing bibliography with indexes, supplement 165, March 1977

    Get PDF
    This bibliography lists 198 reports, articles, and other documents introduced into the NASA scientific and technical information system in February 1977

    CROEQS: Contemporaneous Role Ontology-based Expanded Query Search: implementation and evaluation

    Get PDF
    Searching annotated items in multimedia databases becomes increasingly important. The traditional approach is to build a search engine based on textual metadata. However, in manually annotated multimedia databases, the conceptual level of what is searched for might differ from the high-levelness of the annotations of the items. To address this problem, we present CROEQS, a semantically enhanced search engine. It allows the user to query the annotated persons not only on their name, but also on their roles at the time the multimedia item was broadcast. We also present the ontology used to expand such queries: it allows us to semantically represent the domain knowledge on people fulfilling a role during a temporal interval in general, and politicians holding a political office specifically. The evaluation results show that query expansion using data retrieved from an ontology considerably filters the result set, although there is a performance penalty

    Nanophotonic reservoir computing with photonic crystal cavities to generate periodic patterns

    Get PDF
    Reservoir computing (RC) is a technique in machine learning inspired by neural systems. RC has been used successfully to solve complex problems such as signal classification and signal generation. These systems are mainly implemented in software, and thereby they are limited in speed and power efficiency. Several optical and optoelectronic implementations have been demonstrated, in which the system has signals with an amplitude and phase. It is proven that these enrich the dynamics of the system, which is beneficial for the performance. In this paper, we introduce a novel optical architecture based on nanophotonic crystal cavities. This allows us to integrate many neurons on one chip, which, compared with other photonic solutions, closest resembles a classical neural network. Furthermore, the components are passive, which simplifies the design and reduces the power consumption. To assess the performance of this network, we train a photonic network to generate periodic patterns, using an alternative online learning rule called first-order reduced and corrected error. For this, we first train a classical hyperbolic tangent reservoir, but then we vary some of the properties to incorporate typical aspects of a photonics reservoir, such as the use of continuous-time versus discrete-time signals and the use of complex-valued versus real-valued signals. Then, the nanophotonic reservoir is simulated and we explore the role of relevant parameters such as the topology, the phases between the resonators, the number of nodes that are biased and the delay between the resonators. It is important that these parameters are chosen such that no strong self-oscillations occur. Finally, our results show that for a signal generation task a complex-valued, continuous-time nanophotonic reservoir outperforms a classical (i.e., discrete-time, real-valued) leaky hyperbolic tangent reservoir (normalized root-mean-square errors = 0.030 versus NRMSE = 0.127)

    A small vocabulary database of ultrasound image sequences of vocal tract dynamics

    Full text link
    This paper presents a new database consisting of concurrent articulatory and acoustic speech data. The articulatory data correspond to ultrasound videos of the vocal tract dynamics, which allow the visualization of the tongue upper contour during the speech production process. Acoustic data is composed of 30 short sentences that were acquired by a directional cardioid microphone. This database includes data from 17 young subjects (8 male and 9 female) from the Santander region in Colombia, who reported not having any speech pathology
    • …
    corecore