3 research outputs found

    Learning Semantic Components from Subsymbolic Multimodal Perception

    Perceptual systems often include sensors from several modalities. However, existing robots do not yet sufficiently discover patterns that are spread over the flow of multimodal data they receive. In this paper we present a framework that learns a dictionary of words from full spoken utterances, together with a set of gestures from human demonstrations and the semantic connection between words and gestures. We explain how to use a nonnegative matrix factorization algorithm to learn a dictionary of components that represent meaningful elements present in the multimodal perception, without providing the system with a symbolic representation of the semantics. We illustrate this framework by showing how a learner discovers word-like components from observation of gestures made by a human together with spoken descriptions of the gestures, and how it captures the semantic association between the two.

    Learning from Images and Speech with Non-negative Matrix Factorization Enhanced by Input Space Scaling

    Computational learning from multimodal data is often done with matrix factorization techniques such as NMF (Non-negative Matrix Factorization), pLSA (Probabilistic Latent Semantic Analysis) or LDA (Latent Dirichlet Allocation). To this end, the different modalities of the input are converted into features that are easily placed in a vectorized format. An inherent weakness of such a data representation is that only a subset of these data features actually aids the learning. In this paper, we first describe a simple NMF-based recognition framework operating on speech and image data. We then propose and demonstrate a novel algorithm that scales the inputs of this framework in order to optimize its recognition performance. ©2010 IEEE.
    Driesen J., Van hamme H., Kleijn W.B., "Learning from Images and Speech with Non-negative Matrix Factorization Enhanced by Input Space Scaling", Proceedings IEEE workshop on spoken language technology - SLT 2010, December 12-15, 2010, Berkeley, California, USA. Status: published.

    Learning from images and speech with Non-negative Matrix Factorization enhanced by input space scaling
