72,898 research outputs found
An empirical study of inter-concept similarities in multimedia ontologies
Generic concept detection has been a widely studied topic in recent research on multimedia analysis and retrieval, but the issue of how to exploit the structure of a multimedia ontology, as well as different inter-concept relations, has not received similar attention. In this paper, we present results from our empirical analysis of different types of similarity among semantic concepts in two multimedia ontologies, LSCOM-Lite and CDVP-206. The results show promise that the proposed methods may be helpful in providing insight into the existing inter-concept relations within an ontology and in selecting the most facilitating set of concepts and hierarchical relations. Such an analysis can be utilized in various tasks, such as building more reliable concept detectors and designing large-scale ontologies.
Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval
This paper presents a new state-of-the-art for document image classification
and retrieval, using features learned by deep convolutional neural networks
(CNNs). In object and scene analysis, deep neural nets are capable of learning
a hierarchical chain of abstraction from pixel inputs to concise and
descriptive representations. The current work explores this capacity in the
realm of document analysis, and confirms that this representation strategy is
superior to a variety of popular hand-crafted alternatives. Experiments also
show that (i) features extracted from CNNs are robust to compression, (ii) CNNs
trained on non-document images transfer well to document analysis tasks, and
(iii) enforcing region-specific feature-learning is unnecessary given
sufficient training data. This work also makes available a new labelled subset
of the IIT-CDIP collection, containing 400,000 document images across 16
categories, useful for training new CNNs for document analysis.
On morphological hierarchical representations for image processing and spatial data clustering
Hierarchical data representations in the context of classification and data
clustering were put forward during the fifties. Recently, hierarchical image
representations have gained renewed interest for segmentation purposes. In this
paper, we briefly survey fundamental results on hierarchical clustering and
then detail recent paradigms developed for the hierarchical representation of
images in the framework of mathematical morphology: constrained connectivity
and ultrametric watersheds. Constrained connectivity can be viewed as a way to
constrain an initial hierarchy in such a way that a set of desired constraints
are satisfied. The framework of ultrametric watersheds provides a generic scheme
for computing any hierarchical connected clustering, in particular when such a
hierarchy is constrained. The suitability of this framework for solving
practical problems is illustrated with applications in remote sensing.
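As a rough illustration of the idea of a hierarchical connected clustering, the sketch below builds a single-linkage hierarchy over a 1-D signal whose neighbouring samples are linked by edges weighted by their absolute difference. It is not the ultrametric watershed or constrained connectivity algorithm from the paper; the function name `ultrametric_from_signal` and the path-graph setting are assumptions made for the example. The returned value for a pair of samples is the lowest threshold at which both fall into the same connected cluster, which is an ultrametric distance.

```python
def ultrametric_from_signal(values):
    """Single-linkage merge levels on a path graph (illustrative sketch only).

    Neighbouring samples i and i+1 are joined by an edge of weight
    |values[i] - values[i+1]|. Edges are processed in increasing order
    (Kruskal-style); the weight at which two samples first share a
    component is recorded as their ultrametric distance.
    """
    n = len(values)
    parent = list(range(n))
    members = {i: [i] for i in range(n)}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        (abs(values[i] - values[i + 1]), i, i + 1) for i in range(n - 1)
    )
    ultra = {}
    for weight, i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue
        # Every cross pair between the two components merges at this level.
        for a in members[ri]:
            for b in members[rj]:
                ultra[(min(a, b), max(a, b))] = weight
        parent[rj] = ri
        members[ri].extend(members.pop(rj))
    return ultra


if __name__ == "__main__":
    levels = ultrametric_from_signal([0.0, 1.0, 3.0, 7.0])
    for pair in sorted(levels):
        print(pair, levels[pair])
```

Thresholding the returned levels at a value alpha yields the partition of the signal into segments whose adjacent samples differ by at most alpha, so the whole hierarchy is encoded in one table.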
Spatial Aggregation: Theory and Applications
Visual thinking plays an important role in scientific reasoning. Based on the
research in automating diverse reasoning tasks about dynamical systems,
nonlinear controllers, kinematic mechanisms, and fluid motion, we have
identified a style of visual thinking, imagistic reasoning. Imagistic reasoning
organizes computations around image-like, analogue representations so that
perceptual and symbolic operations can be brought to bear to infer structure
and behavior. Programs incorporating imagistic reasoning have been shown to
perform at an expert level in domains that defy current analytic or numerical
methods. We have developed a computational paradigm, spatial aggregation, to
unify the description of a class of imagistic problem solvers. A program
written in this paradigm has the following properties. It takes a continuous
field and optional objective functions as input, and produces high-level
descriptions of structure, behavior, or control actions. It computes a
multi-layer of intermediate representations, called spatial aggregates, by
forming equivalence classes and adjacency relations. It employs a small set of
generic operators such as aggregation, classification, and localization to
perform bidirectional mapping between the information-rich field and
successively more abstract spatial aggregates. It uses a data structure, the
neighborhood graph, as a common interface to modularize computations. To
illustrate our theory, we describe the computational structure of three
implemented problem solvers -- KAM, MAPS, and HIPAIR -- in terms of the
spatial aggregation generic operators by mixing and matching a library of
commonly used routines. Comment: See http://www.jair.org/ for any accompanying file
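The properties listed in this abstract (classify field values, form equivalence classes, record adjacency relations in a neighborhood graph) can be sketched in a few lines. The example below is only a toy under stated assumptions: the field is a small 2-D grid, equivalence classes are 4-connected regions of cells that share a class label, and the names `aggregate` and `classify` are hypothetical. It does not reproduce KAM, MAPS, or HIPAIR.

```python
def aggregate(field, classify):
    """Group grid cells into spatial aggregates (illustrative sketch only).

    field:    2-D list of numbers (a sampled field).
    classify: function mapping a field value to a class label.
    Returns (regions, adjacency) where regions is a list of
    (class_label, cells) and adjacency is a set of region-index pairs
    that touch each other (a simple neighborhood graph).
    """
    rows, cols = len(field), len(field[0])
    labels = [[classify(v) for v in row] for row in field]
    region = [[None] * cols for _ in range(rows)]
    regions = []

    for r in range(rows):
        for c in range(cols):
            if region[r][c] is not None:
                continue
            # Flood fill: collect the 4-connected cells with the same label.
            rid = len(regions)
            region[r][c] = rid
            stack, cells = [(r, c)], []
            while stack:
                y, x = stack.pop()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and region[ny][nx] is None
                            and labels[ny][nx] == labels[r][c]):
                        region[ny][nx] = rid
                        stack.append((ny, nx))
            regions.append((labels[r][c], cells))

    # Two regions are adjacent if any of their cells are grid neighbours.
    adjacency = set()
    for r in range(rows):
        for c in range(cols):
            for ny, nx in ((r + 1, c), (r, c + 1)):
                if ny < rows and nx < cols and region[ny][nx] != region[r][c]:
                    a, b = region[r][c], region[ny][nx]
                    adjacency.add((min(a, b), max(a, b)))
    return regions, adjacency


if __name__ == "__main__":
    grid = [[-1, -1, 2], [-1, 3, 2], [4, 4, -5]]
    regs, adj = aggregate(grid, lambda v: "pos" if v > 0 else "neg")
    print([(label, len(cells)) for label, cells in regs], sorted(adj))
```

Applying the same function again to the region graph, with a coarser classifier, would give the next, more abstract layer of aggregates.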
Segmentation-based video coding: temporal links
This paper analyzes the main elements that a segmentation-based video coding approach should be based on so that it can address coding efficiency and content-based functionalities. Such elements can be defined as temporal linking and rate control. The basic features of such elements are discussed and, in both cases, a specific implementation is proposed.
A graph-based mathematical morphology reader
This survey paper aims at providing a "literary" anthology of mathematical
morphology on graphs. It describes in the English language many ideas stemming
from a large number of different papers, hence providing a unified view of an
active and diverse field of research.