21,560 research outputs found
Structured Knowledge Representation for Image Retrieval
We propose a structured approach to the problem of retrieval of images by
content and present a description logic that has been devised for the semantic
indexing and retrieval of images containing complex objects. As other
approaches do, we start from low-level features extracted with image analysis
to detect and characterize regions in an image. However, in contrast with
feature-based approaches, we provide a syntax to describe segmented regions as
basic objects and complex objects as compositions of basic ones. Then we
introduce a companion extensional semantics for defining reasoning services,
such as retrieval, classification, and subsumption. These services can be used
for both exact and approximate matching, using similarity measures. Using our
logical approach as a formal specification, we implemented a complete
client-server image retrieval system, which allows a user to pose both queries
by sketch and queries by example. A set of experiments has been carried out on
a testbed of images to assess the retrieval capabilities of the system in
comparison with expert users ranking. Results are presented adopting a
well-established measure of quality borrowed from textual information
retrieval
Semantic web technologies for video surveillance metadata
Video surveillance systems are growing in size and complexity. Such systems typically consist of integrated modules of different vendors to cope with the increasing demands on network and storage capacity, intelligent video analytics, picture quality, and enhanced visual interfaces. Within a surveillance system, relevant information (like technical details on the video sequences, or analysis results of the monitored environment) is described using metadata standards. However, different modules typically use different standards, resulting in metadata interoperability problems. In this paper, we introduce the application of Semantic Web Technologies to overcome such problems. We present a semantic, layered metadata model and integrate it within a video surveillance system. Besides dealing with the metadata interoperability problem, the advantages of using Semantic Web Technologies and the inherent rule support are shown. A practical use case scenario is presented to illustrate the benefits of our novel approach
An MPEG-7 scheme for semantic content modelling and filtering of digital video
Abstract Part 5 of the MPEG-7 standard specifies Multimedia Description Schemes (MDS); that is, the format multimedia content models should conform to in order to ensure interoperability across multiple platforms and applications. However, the standard does not specify how the content or the associated model may be filtered. This paper proposes an MPEG-7 scheme which can be deployed for digital video content modelling and filtering. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that only analyses content relating directly to the preferred content requirements of the user. We present details of the scheme, front-end systems used for content modelling and filtering and experiences with a number of users
MaestROB: A Robotics Framework for Integrated Orchestration of Low-Level Control and High-Level Reasoning
This paper describes a framework called MaestROB. It is designed to make the
robots perform complex tasks with high precision by simple high-level
instructions given by natural language or demonstration. To realize this, it
handles a hierarchical structure by using the knowledge stored in the forms of
ontology and rules for bridging among different levels of instructions.
Accordingly, the framework has multiple layers of processing components;
perception and actuation control at the low level, symbolic planner and Watson
APIs for cognitive capabilities and semantic understanding, and orchestration
of these components by a new open source robot middleware called Project Intu
at its core. We show how this framework can be used in a complex scenario where
multiple actors (human, a communication robot, and an industrial robot)
collaborate to perform a common industrial task. Human teaches an assembly task
to Pepper (a humanoid robot from SoftBank Robotics) using natural language
conversation and demonstration. Our framework helps Pepper perceive the human
demonstration and generate a sequence of actions for UR5 (collaborative robot
arm from Universal Robots), which ultimately performs the assembly (e.g.
insertion) task.Comment: IEEE International Conference on Robotics and Automation (ICRA) 2018.
Video: https://www.youtube.com/watch?v=19JsdZi0TW
The relationship between IR and multimedia databases
Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient.\ud
\ud
Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval.\ud
\ud
Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database.\ud
\ud
First, we introduce a concept layer to enable reasoning over low-level concepts in the database.\ud
\ud
Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer.\ud
\ud
Third, we add the functionality to process the users' relevance feedback.\ud
\ud
We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing.\ud
\ud
We conclude with an outline for implementation of miRRor on top of the Monet extensible database system
TEMPOS: A Platform for Developing Temporal Applications on Top of Object DBMS
This paper presents TEMPOS: a set of models and languages supporting the manipulation of temporal data on top of object DBMS. The proposed models exploit object-oriented technology to meet some important, yet traditionally neglected design criteria related to legacy code migration and representation independence. Two complementary ways for accessing temporal data are offered: a query language and a visual browser. The query language, namely TempOQL, is an extension of OQL supporting the manipulation of histories regardless of their representations, through fully composable functional operators. The visual browser offers operators that facilitate several time-related interactive navigation tasks, such as studying a snapshot of a collection of objects at a given instant, or detecting and examining changes within temporal attributes and relationships. TEMPOS models and languages have been formalized both at the syntactical and the semantical level and have been implemented on top of an object DBMS. The suitability of the proposals with regard to applications' requirements has been validated through concrete case studies
Semantics-based selection of everyday concepts in visual lifelogging
Concept-based indexing, based on identifying various semantic concepts appearing in multimedia, is an attractive option for multimedia retrieval and much research tries to bridge the semantic gap between the media’s low-level features and high-level semantics. Research into concept-based multimedia retrieval has generally focused on detecting concepts from high quality media such as broadcast TV or movies, but it is not well addressed in other domains like lifelogging where the original data is captured with poorer quality. We argue that in noisy domains such as lifelogging, the management of data needs to include semantic reasoning in order to deduce a set of concepts to represent lifelog content for applications like searching, browsing or summarisation. Using semantic concepts to manage lifelog data relies on the fusion of automatically-detected concepts to provide a better understanding of the lifelog data. In this paper, we investigate the selection of semantic concepts for lifelogging which includes reasoning on semantic networks using a density-based approach. In a series of experiments we compare different semantic reasoning approaches and the experimental evaluations we report on lifelog data show the efficacy of our approach
- …