32,707 research outputs found

    Multimodal Grounding for Language Processing

    Get PDF
    This survey discusses how recent developments in multimodal processing facilitate conceptual grounding of language. We categorize the information flow in multimodal processing with respect to cognitive models of human information processing and analyze different methods for combining multimodal representations. Based on this methodological inventory, we discuss the benefit of multimodal grounding for a variety of language processing tasks and the challenges that arise. We particularly focus on multimodal grounding of verbs which play a crucial role for the compositional power of language.Comment: The paper has been published in the Proceedings of the 27 Conference of Computational Linguistics. Please refer to this version for citations: https://www.aclweb.org/anthology/papers/C/C18/C18-1197

    Efficient contour-based shape representation and matching

    Get PDF
    This paper presents an efficient method for calculating the similarity between 2D closed shape contours. The proposed algorithm is invariant to translation, scale change and rotation. It can be used for database retrieval or for detecting regions with a particular shape in video sequences. The proposed algorithm is suitable for real-time applications. In the first stage of the algorithm, an ordered sequence of contour points approximating the shapes is extracted from the input binary images. The contours are translation and scale-size normalized, and small sets of the most likely starting points for both shapes are extracted. In the second stage, the starting points from both shapes are assigned into pairs and rotation alignment is performed. The dissimilarity measure is based on the geometrical distances between corresponding contour points. A fast sub-optimal method for solving the correspondence problem between contour points from two shapes is proposed. The dissimilarity measure is calculated for each pair of starting points. The lowest dissimilarity is taken as the final dissimilarity measure between two shapes. Three different experiments are carried out using the proposed approach: letter recognition using a web camera, our own simulation of Part B of the MPEG-7 core experiment ā€œCE-Shape1ā€ and detection of characters in cartoon video sequences. Results indicate that the proposed dissimilarity measure is aligned with human intuition

    Interaction Issues in Computer Aided Semantic\ud Annotation of Multimedia

    Get PDF
    The CASAM project aims to provide a tool for more efficient and effective annotation of multimedia documents through collaboration between a user and a system performing an automated analysis of the media content. A critical part of the project is to develop a user interface which best supports both the user and the system through optimal human-computer interaction. In this paper we discuss the work undertaken, the proposed user interface and underlying interaction issues which drove its development
    • ā€¦
    corecore