72,898 research outputs found

    An empirical study of inter-concept similarities in multimedia ontologies

    Get PDF
    Generic concept detection has been a widely studied topic in recent research on multimedia analysis and retrieval, but the issue of how to exploit the structure of a multimedia ontology as well as different inter-concept relations, has not received similar attention. In this paper, we present results from our empirical analysis of different types of similarity among semantic concepts in two multimedia ontologies, LSCOM-Lite and CDVP-206. The results show promise that the proposed methods may be helpful in providing insight into the existing inter-concept relations within an ontology and selecting the most facilitating set of concepts and hierarchical relations. Such an analysis as this can be utilized in various tasks such as building more reliable concept detectors and designing large-scale ontologies

    Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval

    Full text link
    This paper presents a new state-of-the-art for document image classification and retrieval, using features learned by deep convolutional neural networks (CNNs). In object and scene analysis, deep neural nets are capable of learning a hierarchical chain of abstraction from pixel inputs to concise and descriptive representations. The current work explores this capacity in the realm of document analysis, and confirms that this representation strategy is superior to a variety of popular hand-crafted alternatives. Experiments also show that (i) features extracted from CNNs are robust to compression, (ii) CNNs trained on non-document images transfer well to document analysis tasks, and (iii) enforcing region-specific feature-learning is unnecessary given sufficient training data. This work also makes available a new labelled subset of the IIT-CDIP collection, containing 400,000 document images across 16 categories, useful for training new CNNs for document analysis

    On morphological hierarchical representations for image processing and spatial data clustering

    Full text link
    Hierarchical data representations in the context of classi cation and data clustering were put forward during the fties. Recently, hierarchical image representations have gained renewed interest for segmentation purposes. In this paper, we briefly survey fundamental results on hierarchical clustering and then detail recent paradigms developed for the hierarchical representation of images in the framework of mathematical morphology: constrained connectivity and ultrametric watersheds. Constrained connectivity can be viewed as a way to constrain an initial hierarchy in such a way that a set of desired constraints are satis ed. The framework of ultrametric watersheds provides a generic scheme for computing any hierarchical connected clustering, in particular when such a hierarchy is constrained. The suitability of this framework for solving practical problems is illustrated with applications in remote sensing

    Spatial Aggregation: Theory and Applications

    Full text link
    Visual thinking plays an important role in scientific reasoning. Based on the research in automating diverse reasoning tasks about dynamical systems, nonlinear controllers, kinematic mechanisms, and fluid motion, we have identified a style of visual thinking, imagistic reasoning. Imagistic reasoning organizes computations around image-like, analogue representations so that perceptual and symbolic operations can be brought to bear to infer structure and behavior. Programs incorporating imagistic reasoning have been shown to perform at an expert level in domains that defy current analytic or numerical methods. We have developed a computational paradigm, spatial aggregation, to unify the description of a class of imagistic problem solvers. A program written in this paradigm has the following properties. It takes a continuous field and optional objective functions as input, and produces high-level descriptions of structure, behavior, or control actions. It computes a multi-layer of intermediate representations, called spatial aggregates, by forming equivalence classes and adjacency relations. It employs a small set of generic operators such as aggregation, classification, and localization to perform bidirectional mapping between the information-rich field and successively more abstract spatial aggregates. It uses a data structure, the neighborhood graph, as a common interface to modularize computations. To illustrate our theory, we describe the computational structure of three implemented problem solvers -- KAM, MAPS, and HIPAIR --- in terms of the spatial aggregation generic operators by mixing and matching a library of commonly used routines.Comment: See http://www.jair.org/ for any accompanying file

    Segmentation-based video coding:temporals links

    Get PDF
    This paper analyzes the main elements that a segmentation-based video coding approach should be based on so that it can address coding efficiency and content-based functionalities. Such elements can be defined as temporal linking and rate control. The basic features of such elements are discussed and, in both cases, a specific implementation is proposed.Peer ReviewedPostprint (published version

    A graph-based mathematical morphology reader

    Full text link
    This survey paper aims at providing a "literary" anthology of mathematical morphology on graphs. It describes in the English language many ideas stemming from a large number of different papers, hence providing a unified view of an active and diverse field of research
    corecore