9,407 research outputs found
Post processing of multimedia information - concepts, problems, and techniques
Currently, most research work on multimedia information processing is focused on multimedia information storage and retrieval, especially indexing and content-based access of multimedia information. We argue that multimedia information processing should include one more level: post-processing. Here "post-processing" means further processing of retrieved multimedia information, which includes fusion of multimedia information and reasoning with multimedia information to reach new conclusions. In this paper, the three levels of multimedia information processing (storage, retrieval, and post-processing) are discussed. The concepts and problems of multimedia information post-processing are identified. Potential techniques that can be used in post-processing are suggested. By highlighting the problems in multimedia information post-processing, we hope this paper will stimulate further research on this important but neglected topic.
Thick 2D Relations for Document Understanding
We use a propositional language of qualitative rectangle relations to detect the reading order from document images. To this end, we define the notion of a document encoding rule and we analyze possible formalisms to express document encoding rules, such as LaTeX and SGML. Document encoding rules expressed in the propositional language of rectangles are used to build a reading order detector for document images. In order to achieve robustness and avoid brittleness when applying the system to real-life document images, the notion of a thick boundary interpretation for a qualitative relation is introduced. The framework is tested on a collection of heterogeneous document images, showing recall rates up to 89%.
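The abstract above does not define the thick boundary interpretation, so the following is only a minimal sketch under one plausible reading: two rectangle edges count as coincident when they lie within a tolerance `t` of each other, and a rectangle relation is a pair of interval relations, one per axis. The names `Rect`, `cmp_thick`, `interval_relation`, and `rect_relation`, the relation labels, and the sample coordinates are all assumptions made for illustration, not the paper's notation.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned bounding box of a document block (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def cmp_thick(a: float, b: float, t: float) -> int:
    """Compare two coordinates; values within t of each other count as equal."""
    if abs(a - b) <= t:
        return 0
    return -1 if a < b else 1


def interval_relation(a0: float, a1: float, b0: float, b1: float, t: float = 0.0) -> str:
    """Qualitative relation of interval [a0, a1] to [b0, b1] with boundary thickness t."""
    end_vs_start = cmp_thick(a1, b0, t)
    start_vs_end = cmp_thick(a0, b1, t)
    if end_vs_start < 0:
        return "precedes"
    if end_vs_start == 0:
        return "meets"
    if start_vs_end > 0:
        return "preceded_by"
    if start_vs_end == 0:
        return "met_by"
    # From here on the two intervals share interior points.
    starts = cmp_thick(a0, b0, t)
    ends = cmp_thick(a1, b1, t)
    if starts == 0 and ends == 0:
        return "equals"
    if starts == 0:
        return "starts" if ends < 0 else "started_by"
    if ends == 0:
        return "finishes" if starts > 0 else "finished_by"
    if starts < 0 and ends > 0:
        return "contains"
    if starts > 0 and ends < 0:
        return "during"
    return "overlaps" if starts < 0 else "overlapped_by"


def rect_relation(a: Rect, b: Rect, t: float = 0.0) -> Tuple[str, str]:
    """Rectangle relation as a pair (x-axis relation, y-axis relation)."""
    return (
        interval_relation(a.x0, a.x1, b.x0, b.x1, t),
        interval_relation(a.y0, a.y1, b.y0, b.y1, t),
    )


# Two stacked text blocks whose edges are misaligned by a couple of pixels.
upper = Rect(10, 10, 110, 50)
lower = Rect(12, 52, 109, 90)
strict = rect_relation(upper, lower, t=0)
thick = rect_relation(upper, lower, t=3)
```

With exact boundaries (`t=0`) the few pixels of scanning noise make the upper block "contain" the lower one horizontally and merely "precede" it vertically; with a tolerance of 3 the same pair reads as horizontally equal and vertically meeting, which is the kind of robustness the abstract attributes to thick boundaries.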
Quantum Information Dynamics and Open World Science
One of the fundamental insights of quantum mechanics is that complete knowledge of the state of a quantum system is not possible. Such incomplete knowledge of a physical system is the norm rather than the exception. This is becoming increasingly apparent as we apply scientific methods to increasingly complex situations. Empirically intensive disciplines in the biological, human, and geosciences all operate in situations where valid conclusions must be drawn, but deductive completeness is impossible. This paper argues that such situations are emerging examples of Open World Science. In this paradigm, scientific models are known to be acting with incomplete information. Open World models acknowledge their incompleteness, and respond positively when new information becomes available. Many methods for creating Open World models have been explored analytically in quantitative disciplines such as statistics, and the increasingly mature area of machine learning. This paper examines the role of quantum theory and quantum logic in the underpinnings of Open World models, examining the importance of structural features such as non-commutativity, degrees of similarity, induction, and the impact of observation. Quantum mechanics is not a problem around the edges of classical theory, but is rather a secure bridgehead in the world of science to come.
The Virtual Image in Streaming Video Indexing
Multimedia technology has been applied to many types of applications, and the large amount of multimedia data needs to be indexed. The use of digital video data in particular is very popular today, and video browsing is a necessary activity in many fields of knowledge. For effective and interactive exploration of large digital video archives, there is a need to index the videos using their visual, audio, and textual data. In this paper, we focus on the visual and textual content of video for indexing. For the former we use the Virtual Image, and for the latter we use the Dublin Core Metadata, suitably extended and multilayered for video browsing and indexing. Before concentrating on the visual content, we explain the main methods for video segmentation and annotation, in order to introduce the steps for video key-feature extraction and video description generation.
Spatial Aggregation: Theory and Applications
Visual thinking plays an important role in scientific reasoning. Based on the
research in automating diverse reasoning tasks about dynamical systems,
nonlinear controllers, kinematic mechanisms, and fluid motion, we have
identified a style of visual thinking, imagistic reasoning. Imagistic reasoning
organizes computations around image-like, analogue representations so that
perceptual and symbolic operations can be brought to bear to infer structure
and behavior. Programs incorporating imagistic reasoning have been shown to
perform at an expert level in domains that defy current analytic or numerical
methods. We have developed a computational paradigm, spatial aggregation, to
unify the description of a class of imagistic problem solvers. A program
written in this paradigm has the following properties. It takes a continuous
field and optional objective functions as input, and produces high-level
descriptions of structure, behavior, or control actions. It computes a
multi-layer of intermediate representations, called spatial aggregates, by
forming equivalence classes and adjacency relations. It employs a small set of
generic operators such as aggregation, classification, and localization to
perform bidirectional mapping between the information-rich field and
successively more abstract spatial aggregates. It uses a data structure, the
neighborhood graph, as a common interface to modularize computations. To
illustrate our theory, we describe the computational structure of three
implemented problem solvers -- KAM, MAPS, and HIPAIR -- in terms of the
spatial aggregation generic operators by mixing and matching a library of
commonly used routines.
Comment: See http://www.jair.org/ for any accompanying file
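As a rough illustration of the pipeline the abstract describes (field in, neighborhood graph as the shared interface, equivalence classes grouped into spatial aggregates), here is a toy sketch on a small sampled field. It is not the authors' implementation and does not reproduce KAM, MAPS, or HIPAIR; the grid discretization, the 4-connected adjacency, the threshold classifier, and the function names `neighborhood_graph`, `classify`, and `aggregate` are all assumptions made for this example.

```python
from collections import deque

def neighborhood_graph(rows, cols):
    """Build a 4-connected adjacency map over grid cells (an assumed adjacency relation)."""
    graph = {}
    for r in range(rows):
        for c in range(cols):
            graph[(r, c)] = [
                (r + dr, c + dc)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
    return graph

def classify(field, threshold):
    """Assign each cell to an equivalence class: True if its value reaches the threshold."""
    return {
        (r, c): value >= threshold
        for r, row in enumerate(field)
        for c, value in enumerate(row)
    }

def aggregate(graph, labels):
    """Merge adjacent cells of the same class into regions using breadth-first search."""
    seen = set()
    regions = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        cells = []
        while queue:
            cell = queue.popleft()
            cells.append(cell)
            for nb in graph[cell]:
                if nb not in seen and labels[nb] == labels[start]:
                    seen.add(nb)
                    queue.append(nb)
        regions.append((labels[start], sorted(cells)))
    return regions

# A tiny sampled scalar field standing in for the "continuous field" input.
field = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.6],
]
graph = neighborhood_graph(len(field), len(field[0]))
labels = classify(field, threshold=0.5)
regions = aggregate(graph, labels)
for label, cells in regions:
    print("high" if label else "low", cells)
```

Each resulting region could itself become a node in a coarser neighborhood graph, which is one way to read the "successively more abstract spatial aggregates" mentioned in the abstract.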
The Outline of an 'Intelligent' Image Retrieval Engine
The first image retrieval systems hold the advantage of being fully automatic, and thus scalable to large collections of images, but are restricted to the representation of low-level aspects (e.g. colors, textures...) without considering the semantic content of images. This obviously compromises interaction, making it difficult for a user to query with precision. The growing need for 'intelligent' systems, i.e. systems capable of bridging this semantic gap, leads to new architectures combining multiple characterizations of the image content. This paper presents SIR1, a promising high-level framework featuring semantics, signal color and spatial characterizations. It features a fully-textual query module based on a language manipulating both boolean and quantification operators, therefore making it possible for a user to request elaborate image scenes such as a "covered (mostly grey) sky" or "people in front of a building".