
    Discriminating visible speech tokens using multi-modality

    Proceedings of the 9th International Conference on Auditory Display (ICAD), Boston, MA, July 7-9, 2003.

    We present a multimodal interactive data exploration tool that facilitates discrimination between visible speech tokens. The tool combines visualization and sonification (non-speech sound) of the data. Visible speech tokens are a class of multidimensional data that have been used extensively in designing a talking head, which in turn has been used to train deaf individuals to perceive speech by watching [1]. Visible speech tokens (consonants), referred to as categories, differ along a set of pre-measured feature dimensions such as mouth height, mouth narrowing, jaw rotation, and upper-lip retraction. The data set was visualized as a series of 1D scatter plots, colored by category. Sonification mapped three qualities of the data (within-category variability, between-category variability, and category identity) to three sound parameters (noise amplitude, duration, and pitch). An experiment assessed the utility of multimodal information compared to visual information alone for exploring this multidimensional data set. Tasks involved answering a series of questions to determine how well each feature or set of features discriminates among categories, which categories are discriminated, and how many. Performance was assessed by measuring accuracy and reaction time on 36 questions varying in scale of understanding and level of dimension integrality. Scale varied at three levels (ratio, ordinal, and nominal), and integrality also varied at three levels (1, 2, and 3 dimensions). A between-subjects design assigned subjects to either a multimodal group or a visual-only group. Results show that accuracy was better for the multimodal group as the number of dimensions required to answer a question (integrality) increased; accuracy was also 10% better for the multimodal group on ordinal questions. For discriminating visible speech tokens, sonification thus provides useful information in addition to that given by visualization, particularly for representing three dimensions simultaneously.
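    The abstract describes a three-way mapping from data qualities to sound parameters. A minimal sketch of such a mapping is shown below; the function names, value ranges, and linear/exponential scalings are illustrative assumptions, not the authors' actual parameterization.

    ```python
    # Hedged sketch of the sonification mapping described in the abstract:
    #   within-category variability  -> noise amplitude
    #   between-category variability -> duration
    #   category identity            -> pitch
    # Ranges and scaling choices here are assumptions for illustration only.

    def linear_scale(value, lo, hi, out_lo, out_hi):
        """Linearly rescale value from [lo, hi] to [out_lo, out_hi]."""
        t = (value - lo) / (hi - lo)
        return out_lo + t * (out_hi - out_lo)

    def sonify(within_var, between_var, category_id):
        """Return (noise_amplitude, duration_s, pitch_hz) for one token.

        within_var and between_var are assumed normalized to [0, 1];
        category_id is a small integer index identifying the consonant class.
        """
        # More within-category spread -> louder noise component.
        amplitude = linear_scale(within_var, 0.0, 1.0, 0.0, 1.0)
        # More between-category spread -> longer tone (0.1 s to 1.0 s here).
        duration = linear_scale(between_var, 0.0, 1.0, 0.1, 1.0)
        # Each category gets a discrete pitch, one semitone apart above 220 Hz.
        pitch = 220.0 * 2 ** (category_id / 12.0)
        return amplitude, duration, pitch
    ```

    Mapping category identity to discrete pitch steps (rather than a continuous scale) matches the nominal nature of that dimension: listeners can identify which category sounds, not just how much of something.
    
    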
