1 research outputs found
Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation
Semantic Image Interpretation is the task of extracting a structured semantic
description from images. This requires the detection of visual relationships:
triples (subject,relation,object) describing a semantic relation between a
subject and an object. A pure supervised approach to visual relationship
detection requires a complete and balanced training set for all the possible
combinations of (subject, relation, object). However, such training sets are
not available and would require a prohibitive human effort. This implies the
ability of predicting triples which do not appear in the training set. This
problem is called zero-shot learning. State-of-the-art approaches to zero-shot
learning exploit similarities among relationships in the training set or
external linguistic knowledge. In this paper, we perform zero-shot learning by
using Logic Tensor Networks, a novel Statistical Relational Learning framework
that exploits both the similarities with other seen relationships and
background knowledge, expressed with logical constraints between subjects,
relations and objects. The experiments on the Visual Relationship Dataset show
that the use of logical constraints outperforms the current methods. This
implies that background knowledge can be used to alleviate the incompleteness
of training sets