Multi modal multi-semantic image retrieval
The rapid growth in the volume of visual information, e.g. images and video, can
overwhelm users’ ability to find and access the specific visual information of interest
to them. In recent years, ontology knowledge-based (KB) image information retrieval
techniques have been adopted in order to extract knowledge from these
images and enhance retrieval performance. A KB framework is presented to
promote semi-automatic annotation and semantic image retrieval using multimodal
cues (visual features and text captions). In addition, a hierarchical structure for the KB
allows metadata to be shared that supports multi-semantics (polysemy) for concepts.
The framework builds up an effective knowledge base pertaining to a domain specific
image collection, e.g. sports, and is able to disambiguate and assign high level
semantics to ‘unannotated’ images.
Local feature analysis of visual content, namely using Scale Invariant Feature
Transform (SIFT) descriptors, has been deployed in the ‘Bag of Visual Words’
(BVW) model as an effective method to represent visual content information and to
enhance its classification and retrieval. Local features are more useful than global
features, e.g. colour, shape or texture, as they are invariant to image scale, orientation
and camera angle. An innovative approach is proposed for the representation,
annotation and retrieval of visual content using a hybrid technique based upon the use
of an unstructured visual word model and upon a (structured) hierarchical ontology KB
model. The structural model facilitates the disambiguation of unstructured visual
words and a more effective classification of visual content, compared to a vector
space model, through exploiting local conceptual structures and their relationships.
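The core of the Bag of Visual Words representation can be sketched as follows: local descriptors are quantised against a codebook of visual-word centres, and each image becomes a normalised histogram of word counts. This is a minimal illustration only; real systems would use 128-dimensional SIFT descriptors and a learned codebook, whereas the 2-d points and centres below are invented for clarity.

```python
# Toy bag-of-visual-words: quantise local descriptors against a fixed
# codebook and represent an image as a histogram of visual words.
# 2-d "descriptors" stand in for real 128-d SIFT output.

def nearest_word(desc, codebook):
    """Index of the codebook centre closest to the descriptor."""
    dists = [sum((d - c) ** 2 for d, c in zip(desc, centre))
             for centre in codebook]
    return dists.index(min(dists))

def bovw_histogram(descriptors, codebook):
    """Normalised visual-word histogram for one image."""
    counts = [0] * len(codebook)
    for desc in descriptors:
        counts[nearest_word(desc, codebook)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]   # three "visual words"
image_descriptors = [(0.5, 0.2), (9.8, 10.1), (10.2, 9.7), (0.1, 9.9)]
hist = bovw_histogram(image_descriptors, codebook)   # [0.25, 0.5, 0.25]
```

The resulting histogram is the vector-space representation the thesis contrasts with its structured ontology model.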
The key contributions of this framework in using local features for image
representation include: first, a method to generate visual words using the semantic
local adaptive clustering (SLAC) algorithm which takes term weight and spatial
locations of keypoints into account. Consequently, the semantic information is
preserved. Second, a technique is used to detect the domain-specific ‘non-informative
visual words’ which are ineffective at representing the content of visual data and
degrade its categorisation ability. Third, a method to combine an ontology model with
a visual word model to resolve synonym (visual heterogeneity) and polysemy
problems is proposed. The experimental results show that this approach can discover
semantically meaningful visual content descriptions and recognise specific events,
e.g., sports events, depicted in images efficiently.
Since discovering the semantics of an image is an extremely challenging problem, one
promising approach to enhance visual content interpretation is to use any associated
textual information that accompanies an image, as a cue to predict the meaning of an
image, by transforming this textual information into a structured annotation for an
image, e.g. using XML, RDF, OWL or MPEG-7. Although text and images are distinct
types of information representation and modality, there are some strong, invariant,
implicit, connections between images and any accompanying text information.
Semantic analysis of image captions can be used by image retrieval systems to
retrieve selected images more precisely. To do this, Natural Language Processing
(NLP) is first exploited to extract concepts from image captions. Next, an
ontology-based knowledge model is deployed in order to resolve natural language
ambiguities. To deal with the accompanying text information, two methods to extract
knowledge from textual information have been proposed. First, metadata can be
extracted automatically from text captions and restructured with respect to a semantic
model. Second, the use of Latent Semantic Indexing (LSI) in relation to a domain-specific ontology-based
knowledge model enables the combined framework to tolerate ambiguities and
variations (incompleteness) of metadata. The use of the ontology-based knowledge
model allows the system to find indirectly relevant concepts in image captions and
thus leverage these to represent the semantics of images at a higher level.
Experimental results show that the proposed framework significantly enhances image
retrieval and leads to a narrowing of the semantic gap between lower-level machine-derived
and higher-level human-understandable conceptualisations.
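The caption-analysis idea above, resolving caption terms to canonical ontology concepts and then expanding them to ancestors so images match at a higher semantic level, can be sketched as follows. The toy ontology, synonym table, and caption are invented stand-ins for the thesis's domain-specific knowledge model.

```python
# Sketch of caption-to-concept resolution with a toy ontology: synonyms
# map to a canonical concept, whose ancestors are then added so that
# captions can be matched at a higher semantic level.

ONTOLOGY = {            # concept -> parent (None marks the root); invented
    "football": "ball_sport",
    "basketball": "ball_sport",
    "ball_sport": "sport",
    "sport": None,
}
SYNONYMS = {"soccer": "football", "footie": "football", "hoops": "basketball"}

def caption_concepts(caption):
    """Canonical concepts mentioned in a caption, expanded to ancestors."""
    concepts = set()
    for token in caption.lower().split():
        concept = SYNONYMS.get(token, token)
        while concept in ONTOLOGY:           # walk up the hierarchy
            concepts.add(concept)
            concept = ONTOLOGY[concept]
    return concepts

concepts = caption_concepts("Soccer match in Leeds")
# -> {"football", "ball_sport", "sport"}
```

A caption mentioning only "soccer" thus also indexes the image under the broader concepts, which is how indirectly relevant concepts enter the representation.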
ImageSieve: Exploratory search of museum archives with named entity-based faceted browsing
Over the last few years, faceted search has emerged as an attractive alternative to the traditional "text box" search and has become one of the standard ways of interaction on many e-commerce sites. However, these applications of faceted search are limited to domains where the objects of interest have already been classified along several independent dimensions, such as price, year, or brand. While automatic approaches to generate faceted search interfaces have been proposed, it is not yet clear to what extent the automatically produced interfaces will be useful to real users, and whether their quality can match or surpass that of their manually produced predecessors. The goal of this paper is to introduce an exploratory search interface called ImageSieve, which shares many features with traditional faceted browsing, but can function without the use of traditional faceted metadata. ImageSieve uses automatically extracted and classified named entities, which play important roles in many domains (such as news collections, image archives, etc.). We describe one specific application of ImageSieve for image search. Here, named entities extracted from the descriptions of the retrieved images are used to organize a faceted browsing interface, which then helps users to make sense of and further explore the retrieved images. The results of a user study of ImageSieve demonstrate that a faceted search system based on named entities can help users explore large collections and find relevant information more effectively.
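The faceting step described in the abstract can be sketched as grouping results by the type and value of their extracted entities, then filtering on a selected facet value. Entity extraction itself (an NER tagger) is assumed to have run upstream; the result records below are invented.

```python
# Sketch of ImageSieve-style faceting: named entities already extracted
# from image descriptions (with their types) are grouped into facets,
# and selecting a facet value filters the result set.

results = [                                  # invented retrieval results
    {"id": 1, "entities": {("PERSON", "Lincoln"), ("PLACE", "Illinois")}},
    {"id": 2, "entities": {("PERSON", "Lincoln"), ("PLACE", "Washington")}},
    {"id": 3, "entities": {("PLACE", "Illinois")}},
]

def build_facets(results):
    """Map facet (entity type) -> value -> count of matching results."""
    facets = {}
    for r in results:
        for etype, value in r["entities"]:
            facets.setdefault(etype, {}).setdefault(value, 0)
            facets[etype][value] += 1
    return facets

def filter_by_facet(results, etype, value):
    """Results whose descriptions mention the selected entity."""
    return [r for r in results if (etype, value) in r["entities"]]

facets = build_facets(results)
filtered = filter_by_facet(results, "PLACE", "Illinois")   # ids 1 and 3
```

The counts shown next to each facet value are what lets users gauge how a selection will narrow the collection.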
Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller, namely introspective explanations, or informed by the neural controller's output, namely rationalizations. Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles. In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolutional network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of the controller and explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment: strong and weak alignment. These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match training data.
In Chapter 5, we propose to address this issue by augmenting the training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts the way it attends to the scene (visual attention) and the control (steering and speed). Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls accordingly.
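The causal filtering step from Chapter 3 can be illustrated with a toy controller: starting from attention weights over image regions, each attended region is masked in turn, and only regions whose removal noticeably changes the output are kept. The linear "controller", the features, and both thresholds below are invented purely to show the mechanism.

```python
# Toy causal filtering: mask each attended region and keep only those
# whose removal changes the controller output, separating true
# influences from spurious attention.

REGION_FEATURES = [1.0, 0.0, 2.0, 0.5]   # one scalar feature per region
WEIGHTS = [0.8, 0.0, 0.6, 0.01]          # invented linear "controller"

def steering(features):
    return sum(f * w for f, w in zip(features, WEIGHTS))

def causal_regions(features, attention, attn_thresh=0.1, delta_thresh=0.05):
    base = steering(features)
    causal = []
    for i, a in enumerate(attention):
        if a < attn_thresh:
            continue                      # region not attended; skip
        masked = list(features)
        masked[i] = 0.0                   # remove the region's contribution
        if abs(steering(masked) - base) > delta_thresh:
            causal.append(i)              # removal changed the output
    return causal

attention = [0.5, 0.2, 0.25, 0.05]        # attended regions: 0, 1, 2
causal = causal_regions(REGION_FEATURES, attention)   # [0, 2]
```

Region 1 is attended but has no causal effect on the output, so it is filtered out; this is the "spurious attention" case the abstract mentions.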
Personalization in cultural heritage: the road travelled and the one ahead
Over the last 20 years, cultural heritage has been a favored domain for personalization research. For years, researchers have experimented with the cutting edge
technology of the day; now, with the convergence of internet and wireless technology, and the increasing adoption of the Web as a platform for the publication of information, the visitor is able to exploit cultural heritage material before, during and after the visit, having different goals and requirements in each phase. However, cultural heritage sites have a huge amount of information to present, which must be filtered and personalized in order to enable the individual user to easily access it. Personalization of cultural heritage information requires a system that is able to model the user
(e.g., interest, knowledge and other personal characteristics), as well as contextual aspects, select the most appropriate content, and deliver it in the most suitable way. It should be noted that achieving this result is extremely challenging in the case of first-time users, such as tourists who visit a cultural heritage site for the first time (and maybe the only time in their life). In addition, as tourism is a social activity, adapting to the individual is not enough because groups and communities have to be modeled and supported as well, taking into account their mutual interests, previous mutual experience, and requirements. How to model and represent the user(s) and the context of the visit, and how to reason with regard to the information that is available, are the challenges faced by researchers in personalization of cultural heritage. Notwithstanding the effort invested so far, a definitive solution is far from being reached, mainly because new technology and new aspects of personalization are constantly being introduced. This article surveys the research in this area. Starting from the earlier systems, which presented cultural heritage information in kiosks, it summarizes the evolution of personalization techniques in museum web sites, virtual collections and mobile guides, up to the recent extension of cultural heritage toward the semantic and social web. The paper concludes with current challenges and points out areas where future research is needed.
Semantic web service automation with lightweight annotations
Web services, both RESTful and WSDL-based, are an increasingly important part of the Web. With the application of semantic technologies, we can achieve automation of the use of those services. In this paper, we present WSMO-Lite and MicroWSMO, two related lightweight approaches to semantic Web service description, evolved from the WSMO framework. WSMO-Lite uses SAWSDL to annotate WSDL-based services, whereas MicroWSMO uses the hRESTS microformat to annotate RESTful APIs and services. Both frameworks share an ontology for service semantics together with most of the automation algorithms.
Ontology mapping by concept similarity
This paper presents an approach to the problem of mapping ontologies. The motivation for the research stems from the Diogene Project, which is developing a web training environment for ICT professionals. The system includes high-quality training material from registered content providers, and free web material will also be made available through the project's "Web Discovery" component. This involves using web search engines to locate relevant material, and mapping the ontology at the core of the Diogene system to other ontologies that exist on the Semantic Web. The project's approach to ontology mapping is presented, and an evaluation of this method is described.
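A minimal sketch of concept-based ontology mapping pairs each source concept with the target concept whose label shares the most tokens. This is a crude stand-in for the Diogene Project's actual similarity measure; the labels and the token normalisation are invented for illustration.

```python
# Sketch of mapping concepts between two ontologies by label-token
# overlap: each source concept is linked to the best-matching target
# concept, if any overlap exists.

source = ["Java Programming", "Network Security"]
target = ["Programming in Java", "Security of Networks", "Databases"]

def token_overlap(a, b):
    """Count of shared normalised tokens (lowercased, crude de-pluralised)."""
    ta = {t.lower().rstrip("s") for t in a.split() if len(t) > 2}
    tb = {t.lower().rstrip("s") for t in b.split() if len(t) > 2}
    return len(ta & tb)

def map_concepts(source, target):
    """Source concept -> most similar target concept (when overlap > 0)."""
    mapping = {}
    for s in source:
        best = max(target, key=lambda t: token_overlap(s, t))
        if token_overlap(s, best) > 0:
            mapping[s] = best
    return mapping

mapping = map_concepts(source, target)
```

Real mappers add structural evidence (shared parents, instance overlap) on top of such lexical scores, but the lexical core looks like this.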
An Ontology-Based Recommender System with an Application to the Star Trek Television Franchise
Collaborative filtering based recommender systems have proven to be extremely
successful in settings where user preference data on items is abundant.
However, collaborative filtering algorithms are hindered by their weakness
against the item cold-start problem and general lack of interpretability.
Ontology-based recommender systems exploit hierarchical organizations of users
and items to enhance browsing, recommendation, and profile construction. While
ontology-based approaches address the shortcomings of their collaborative
filtering counterparts, ontological organizations of items can be difficult to
obtain for items that mostly belong to the same category (e.g., television
series episodes). In this paper, we present an ontology-based recommender
system that integrates the knowledge represented in a large ontology of
literary themes to produce fiction content recommendations. The main novelty of
this work is an ontology-based method for computing similarities between items
and its integration with the classical Item-KNN (K-nearest neighbors)
algorithm. As a case study, we evaluated the proposed method against other
approaches by performing the classical rating prediction task on a collection
of Star Trek television series episodes in an item cold-start scenario. This
transverse evaluation provides insights into the utility of different
information resources and methods for the initial stages of recommender system
development. We found our proposed method to be a convenient alternative to
collaborative filtering approaches for collections of mostly similar items,
particularly when other content-based approaches are not applicable or
otherwise unavailable. Aside from the new methods, this paper contributes a
testbed for future research and an online framework to collaboratively extend
the ontology of literary themes to cover other narrative content.
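The paper's two ingredients, item-item similarity over themes plugged into Item-KNN rating prediction, can be sketched as follows. Jaccard similarity over theme sets is used here as a simple stand-in for the paper's ontology-based measure, and the episode themes and ratings are invented.

```python
# Sketch of theme-based Item-KNN: item similarity from shared themes
# (Jaccard here), then a similarity-weighted average of the user's
# ratings predicts a score for a cold-start item.

themes = {                                   # invented episode themes
    "ep1": {"time_travel", "first_contact"},
    "ep2": {"time_travel", "sacrifice"},
    "ep3": {"diplomacy", "first_contact"},
}
user_ratings = {"ep1": 5.0, "ep3": 3.0}      # "ep2" has no ratings yet

def jaccard(a, b):
    return len(a & b) / len(a | b)

def predict(item, ratings, k=2):
    """Similarity-weighted average over the k most similar rated items."""
    sims = sorted(((jaccard(themes[item], themes[j]), r)
                   for j, r in ratings.items()), reverse=True)[:k]
    num = sum(s * r for s, r in sims)
    den = sum(s for s, _ in sims)
    return num / den if den else None

pred = predict("ep2", user_ratings)
```

Because "ep2" shares a theme only with "ep1", the prediction is pulled entirely toward that rating, which is exactly the cold-start behaviour content-based similarity buys over collaborative filtering.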
Web Service Discovery Based on Past User Experience
Web service technology provides a way of simplifying interoperability among different organizations. A piece of functionality available as a web service can be involved in a new business process. Given the steadily growing number of available web services, it is hard for developers to find services appropriate for their needs. The main research efforts in this area are oriented toward developing mechanisms for semantic web service description and matching. In this paper, we present an alternative approach for supporting users in web service discovery. Our system implements the implicit culture approach for recommending web services to developers, based on the history of decisions made by other developers with similar needs. We explain the main ideas underlying our approach and report on experimental results.
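The co-occurrence core of recommending services from past user decisions can be sketched as follows: find past requests that overlap the current one and rank the services those developers chose. The paper's implicit culture machinery is richer than this; the history records and keywords below are invented.

```python
# Sketch of experience-based service discovery: rank services by how
# strongly past requests that led to them overlap the current request.

history = [                                   # (request keywords, chosen service)
    ({"weather", "forecast"}, "WeatherWS"),
    ({"weather", "city"}, "WeatherWS"),
    ({"currency", "rates"}, "FxWS"),
]

def recommend(request, history):
    """Services ranked by total keyword overlap with past requests."""
    scores = {}
    for keywords, service in history:
        overlap = len(request & keywords)
        if overlap:
            scores[service] = scores.get(service, 0) + overlap
    return sorted(scores, key=scores.get, reverse=True)

ranked = recommend({"weather", "tomorrow"}, history)   # ["WeatherWS"]
```

Each new developer decision appended to the history refines future rankings, which is the "learning from past experience" loop the abstract describes.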