872 research outputs found

    Automatic Emotion Recognition from Mandarin Speech


    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing during the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research, 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial positions of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This suggests two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it lends further weight to the argument that objects may be stored in, and retrieved from, a pre-attentional store during this task.
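
    As a sanity check (not part of the original study's analysis), the reported p-value follows directly from the F statistic quoted above: with 1 and 4 degrees of freedom, the upper tail of the F distribution at 2.565 gives p ≈ 0.185. A minimal sketch using scipy:

```python
from scipy import stats

# Values reported in the abstract: F(1,4) = 2.565.
f_value, df_num, df_den = 2.565, 1, 4

# The p-value is the upper-tail probability of the F distribution.
p = stats.f.sf(f_value, df_num, df_den)
print(f"F({df_num},{df_den}) = {f_value}, p = {p:.3f}")  # -> p = 0.185
```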

    Gut Microorganisms of Aquatic Animals 2.0

    This is a collection of scientific research articles on the associations and/or interactions of various aquatic animals with their microorganisms, focusing mostly on fish and, in particular, on gut bacterial communities.

    A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation

    Body language (BL) refers to non-verbal communication expressed through physical movements, gestures, facial expressions, and postures. It is a form of communication that conveys information, emotions, attitudes, and intentions without the use of spoken or written words, and it plays a crucial role in interpersonal interactions, where it can complement or even override verbal communication. Deep multi-modal learning techniques have shown promise in understanding and analyzing these diverse aspects of BL, and this survey emphasizes their application to BL generation and recognition. Several common forms of BL are considered, namely Sign Language (SL), Cued Speech (CS), Co-speech (CoS), and Talking Head (TH); we analyze these four forms and, for the first time, establish the connections among them. Their generation and recognition often involve multi-modal approaches. Benchmark datasets for BL research are collected and organized, along with evaluations of state-of-the-art (SOTA) methods on these datasets. The survey highlights challenges such as limited labeled data, multi-modal learning, and the need for domain adaptation to generalize models to unseen speakers or languages. Future research directions are presented, including exploring self-supervised learning techniques, integrating contextual information from other modalities, and exploiting large-scale pre-trained multi-modal models. In summary, this survey provides the first comprehensive treatment of deep multi-modal learning for BL generation and recognition. By analyzing advancements, challenges, and future directions, it serves as a valuable resource for researchers and practitioners advancing this field. In addition, we maintain a continuously updated paper list on deep multi-modal learning for BL recognition and generation: https://github.com/wentaoL86/awesome-body-language
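
    To make the multi-modal setting concrete, below is a minimal late-fusion sketch in PyTorch for a BL recognition task. Every architectural choice here (GRU encoders over pose and face landmark sequences, fusion by concatenation, the dimensions) is an illustrative assumption, not a method taken from the survey:

```python
import torch
import torch.nn as nn

class LateFusionBLClassifier(nn.Module):
    """Toy two-modality classifier: one encoder per modality, late fusion."""
    def __init__(self, pose_dim=102, face_dim=136, hidden=256, n_classes=50):
        super().__init__()
        # One GRU encoder per modality (e.g., body pose and facial landmarks).
        self.pose_enc = nn.GRU(pose_dim, hidden, batch_first=True)
        self.face_enc = nn.GRU(face_dim, hidden, batch_first=True)
        # Fuse the final hidden states by concatenation, then classify.
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, pose_seq, face_seq):
        _, h_pose = self.pose_enc(pose_seq)  # h: (num_layers, batch, hidden)
        _, h_face = self.face_enc(face_seq)
        fused = torch.cat([h_pose[-1], h_face[-1]], dim=-1)
        return self.head(fused)              # (batch, n_classes) logits

# Toy forward pass: a batch of 8 clips, 30 frames per modality.
model = LateFusionBLClassifier()
logits = model(torch.randn(8, 30, 102), torch.randn(8, 30, 136))
print(logits.shape)  # torch.Size([8, 50])
```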

    Early and Late Stage Mechanisms for Vocalization Processing in the Human Auditory System

    The human auditory system rapidly processes incoming acoustic information, actively filtering, categorizing, or suppressing different elements of the incoming acoustic stream. Vocalizations produced by other humans (conspecifics) likely represent the most ethologically relevant sounds encountered by hearing individuals. Subtle acoustic characteristics of these vocalizations aid in determining the identity, emotional state, health, intent, etc., of the producer. The ability to assess vocalizations is likely subserved by a specialized network of structures and functional connections that are optimized for this stimulus class. Early elements of this network would show sensitivity to the most basic acoustic features of these sounds; later elements may show categorically selective response patterns that represent high-level semantic organization of different classes of vocalizations. A combination of functional magnetic resonance imaging (fMRI) and electrophysiological studies was performed to investigate and describe some of the earlier- and later-stage mechanisms of conspecific vocalization processing in human auditory cortices. Using fMRI, cortical representations of harmonic signal content were found along the middle superior temporal gyri, between the primary auditory cortices along Heschl's gyri and the superior temporal sulci, higher-order auditory regions. Electrophysiological findings likewise demonstrated a parametric response profile to harmonic signal content. Utilizing a novel class of vocalizations, human-mimicked versions of animal vocalizations, we demonstrated the presence of a left-lateralized cortical processing hierarchy for conspecific vocalizations, contrary to previous findings describing similar bilateral networks. This hierarchy originated near primary auditory cortices and was further supported by auditory evoked potential data suggesting differential temporal processing dynamics for conspecific human vocalizations versus those produced by other species. Taken together, these results suggest that there are auditory cortical networks highly optimized for processing utterances produced by the human vocal tract. Understanding the function and structure of these networks will be critical for advancing the development of novel communicative therapies and the design of future assistive hearing devices.
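
    The "parametric response profile to harmonic signal content" implies stimuli whose harmonicity was varied along a continuum. As a purely illustrative sketch (the study's actual stimuli are not described here), one common way to vary harmonic content parametrically is to build complex tones with an increasing number of harmonics:

```python
import numpy as np

def harmonic_tone(f0=220.0, n_harmonics=1, dur=0.5, sr=44100):
    """Complex tone: the first n_harmonics partials of f0, 1/k roll-off."""
    t = np.arange(int(dur * sr)) / sr
    tone = sum(np.sin(2 * np.pi * f0 * k * t) / k
               for k in range(1, n_harmonics + 1))
    return tone / np.max(np.abs(tone))  # normalize peak amplitude

# A parametric series from a pure tone to a richly harmonic complex.
stimuli = {n: harmonic_tone(n_harmonics=n) for n in (1, 2, 4, 8, 16)}
```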

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 out of a keenly felt need to share know-how, objectives, and results among areas that until then had seemed quite distinct, such as bioengineering, medicine, and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years, the initial topics have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.