Semantics-based selection of everyday concepts in visual lifelogging
Concept-based indexing, which identifies the semantic concepts appearing in multimedia, is an attractive option for multimedia retrieval, and much research tries to bridge the semantic gap between the media’s low-level features and high-level semantics. Research into concept-based multimedia retrieval has generally focused on detecting concepts in high-quality media such as broadcast TV or movies; it is less well addressed in domains like lifelogging, where the original data is captured at lower quality. We argue that in noisy domains such as lifelogging, the management of data needs to include semantic reasoning in order to deduce a set of concepts to represent lifelog content for applications like searching, browsing or summarisation. Using semantic concepts to manage lifelog data relies on the fusion of automatically detected concepts to provide a better understanding of the lifelog data. In this paper, we investigate the selection of semantic concepts for lifelogging, which includes reasoning on semantic networks using a density-based approach. In a series of experiments we compare different semantic reasoning approaches, and the experimental evaluations we report on lifelog data show the efficacy of our approach.
Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering
Many vision and language tasks require commonsense reasoning beyond
data-driven image and natural language processing. Here we adopt Visual
Question Answering (VQA) as an example task, where a system is expected to
answer a question in natural language about an image. Current state-of-the-art
systems attempt to solve the task using deep neural architectures and
achieve promising performance. However, the resulting systems are generally
opaque and struggle to understand questions for which extra knowledge
is required. In this paper, we present an explicit reasoning layer on top of a
set of penultimate neural network based systems. The reasoning layer enables
reasoning and answering questions where additional knowledge is required, and
at the same time provides an interpretable interface to the end users.
Specifically, the reasoning layer adopts a Probabilistic Soft Logic (PSL) based
engine to reason over a basket of inputs: visual relations, the semantic parse
of the question, and background ontological knowledge from word2vec and
ConceptNet. Experimental analysis of the answers and the key evidential
predicates generated on the VQA dataset validates our approach.
Comment: 9 pages, 3 figures, AAAI 201
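As a rough illustration of how a Probabilistic Soft Logic rule scores evidence, the sketch below uses the Łukasiewicz relaxation that PSL is built on: truth values lie in [0, 1], conjunction is max(0, a + b - 1), and a rule "body -> head" is penalised by its distance to satisfaction, max(0, body - head). The rule, the predicate names, the weight, and the truth values are hypothetical and are not taken from the paper's system.

```python
def lukasiewicz_and(*truths: float) -> float:
    """Lukasiewicz t-norm: soft conjunction of truth values in [0, 1]."""
    return max(0.0, sum(truths) - (len(truths) - 1))


def distance_to_satisfaction(body: float, head: float) -> float:
    """How far the soft rule `body -> head` is from being satisfied."""
    return max(0.0, body - head)


def rule_penalty(weight: float, body: float, head: float) -> float:
    """Weighted (linear) hinge penalty contributed by one grounded rule."""
    return weight * distance_to_satisfaction(body, head)


# Hypothetical grounded rule:
#   similar(question_word, "dog") & has_object(image, "dog") -> answer("dog")
similar = 0.9      # assumed word-embedding similarity
has_object = 0.8   # assumed detector confidence
answer = 0.5       # candidate truth value for the answer atom

body = lukasiewicz_and(similar, has_object)
penalty = rule_penalty(2.0, body, answer)
```

Inference in a PSL engine chooses the truth values of the unobserved atoms (here `answer`) to minimise the sum of such penalties over all grounded rules; raising `answer` toward the body's truth value drives this rule's penalty to zero.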
Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation
We present a large-scale collection of diverse natural language inference
(NLI) datasets that help provide insight into how well a sentence
representation captures distinct types of reasoning. The collection results
from recasting 13 existing datasets from 7 semantic phenomena into a common NLI
structure, resulting in over half a million labeled context-hypothesis pairs in
total. We refer to our collection as the DNC: Diverse Natural Language
Inference Collection. The DNC is available online at https://www.decomp.net,
and will grow over time as additional resources are recast and added from novel
sources.
Comment: To be presented at EMNLP 2018. 15 page
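To make "recasting" concrete, here is a minimal sketch of how a labeled example from a non-NLI dataset could be turned into context-hypothesis pairs. The sentiment task, the templates, and the label strings are assumptions chosen for illustration and are not claimed to match the DNC's actual recasting procedures.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NLIPair:
    context: str
    hypothesis: str
    label: str  # "entailed" or "not-entailed" (assumed label names)


def recast_sentiment(text: str, is_positive: bool) -> List[NLIPair]:
    """Recast one sentiment-labeled review into two NLI pairs.

    One hypothesis asserts a positive opinion and the other a negative
    one; exactly one of them agrees with the original label.
    """
    context = f'When asked about the product, the reviewer said "{text}"'
    pairs = []
    for polarity, claim in ((True, "liked"), (False, "did not like")):
        hypothesis = f"The reviewer {claim} the product"
        label = "entailed" if polarity == is_positive else "not-entailed"
        pairs.append(NLIPair(context, hypothesis, label))
    return pairs


pairs = recast_sentiment("Great value for the price", True)
```

Because every source dataset is mapped into the same context, hypothesis, and label structure, a single sentence-representation model can be probed on each phenomenon without task-specific changes.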
Developmental Bayesian Optimization of Black-Box with Visual Similarity-Based Transfer Learning
We present a developmental framework based on a long-term memory and
reasoning mechanisms (Vision Similarity and Bayesian Optimisation). This
architecture allows a robot to autonomously optimize hyper-parameters that need
to be tuned from any action and/or vision module, treated as a black-box. The
learning can take advantage of past experiences (stored in the episodic and
procedural memories) in order to warm-start the exploration using a set of
hyper-parameters previously optimized from objects similar to the new unknown
one (stored in a semantic memory). As an example, the system has been used to
optimize 9 continuous hyper-parameters of a professional software package (Kamido)
both in simulation and with a real robot (industrial robotic arm Fanuc) with a
total of 13 different objects. The robot is able to find a good object-specific
optimization in 68 (simulation) or 40 (real) trials. In simulation, we
demonstrate the benefit of the transfer learning based on visual similarity, as
opposed to amnesic learning (i.e. learning from scratch every time).
Moreover, with the real robot, we show that the method consistently outperforms
the manual optimization from an expert with less than 2 hours of training time
to achieve a success rate of more than 88%.
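The warm-start idea can be sketched as follows: look up the stored object whose visual features are most similar to the new one, then begin the search from its previously optimized hyper-parameters. This is an illustration under stated assumptions, not the authors' implementation: the feature vectors, the cosine-similarity measure, and the memory layout are hypothetical, and a simple local random search stands in for Bayesian optimisation to keep the example self-contained.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Memory = List[Tuple[Sequence[float], List[float]]]  # (features, best params)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def warm_start(memory: Memory, new_features: Sequence[float], k: int = 1) -> List[List[float]]:
    """Return the hyper-parameters of the k most visually similar stored objects."""
    ranked = sorted(
        memory,
        key=lambda item: cosine_similarity(item[0], new_features),
        reverse=True,
    )
    return [params for _, params in ranked[:k]]


def optimise(
    score: Callable[[List[float]], float],
    seeds: List[List[float]],
    trials: int,
    step: float = 0.1,
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], float]:
    """Local random search (a stand-in for Bayesian optimisation) from the seeds."""
    rng = rng or random.Random(0)
    best = max(seeds, key=score)
    best_score = score(best)
    for _ in range(trials):
        # Perturb the incumbent and clamp each parameter to [0, 1].
        candidate = [min(1.0, max(0.0, p + rng.uniform(-step, step))) for p in best]
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score
```

An "amnesic" baseline would call `optimise` with random seeds instead of the output of `warm_start`, which is the comparison the abstract describes for the simulation experiments.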