Search CORE

1,784 research outputs found

Semantic bottleneck for computer vision tasks

Author: Gabriëlle Ras
LA Hendricks
MD Zeiler
T-Y Lin
X Lin
Publication venue
Publication date: 06/11/2018
Field of study

This paper introduces a novel method for the representation of images that is semantic by nature, addressing the question of computation intelligibility in computer vision tasks. More specifically, our proposition is to introduce what we call a semantic bottleneck in the processing pipeline, which is a crossing point in which the representation of the image is entirely expressed with natural language , while retaining the efficiency of numerical representations. We show that our approach is able to generate semantic representations that give state-of-the-art results on semantic content-based image retrieval and also perform very well on image classification tasks. Intelligibility is evaluated through user centered experiments for failure detection

arXiv.org e-Print Archive

Goal-Oriented Visual Question Generation via Intermediate Rewards

Author: Lu J.
Shen C.
van den Hengel A.
Wu Q.
Zhang J.
Zhang J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018
Field of study

© 2018, Springer Nature Switzerland AG. Despite significant progress in a variety of vision-and-language problems, developing a method capable of asking intelligent, goal-oriented questions about images is proven to be an inscrutable challenge. Towards this end, we propose a Deep Reinforcement Learning framework based on three new intermediate rewards, namely goal-achieved, progressive and informativeness that encourage the generation of succinct questions, which in turn uncover valuable information towards the overall goal. By directly optimizing for questions that work quickly towards fulfilling the overall goal, we avoid the tendency of existing methods to generate long series of inane queries that add little value. We evaluate our model on the GuessWhat?! dataset and show that the resulting questions can help a standard ‘Guesser’ identify a specific object in an image at a much higher success rate

OPUS - University of Technology Sydney

Visual Question Answering as Reading Comprehension

Author: Hengel Anton van den
Li Hui
Shen Chunhua
Wang Peng
Publication venue
Publication date: 28/11/2018
Field of study

Visual question answering (VQA) demands simultaneous comprehension of both the image visual content and natural language questions. In some cases, the reasoning needs the help of common sense or general knowledge which usually appear in the form of text. Current methods jointly embed both the visual information and the textual feature into the same space. However, how to model the complex interactions between the two different modalities is not an easy task. In contrast to struggling on multimodal feature fusion, in this paper, we propose to unify all the input information by natural language so as to convert VQA into a machine reading comprehension problem. With this transformation, our method not only can tackle VQA datasets that focus on observation based questions, but can also be naturally extended to handle knowledge-based VQA which requires to explore large-scale external knowledge base. It is a step towards being able to exploit large volumes of text and natural language processing techniques to address VQA problem. Two types of models are proposed to deal with open-ended VQA and multiple-choice VQA respectively. We evaluate our models on three VQA benchmarks. The comparable performance with the state-of-the-art demonstrates the effectiveness of the proposed method

arXiv.org e-Print Archive

Business-Driven Data Recommender System:Design and Implementation

Author: Burnay Corentin
Linden Isabelle
Pinon Sarah
Publication venue
Publication date: 01/01/2023
Field of study

A Survey on Knowledge Graphs: Representation, Acquisition and Applications

Author: Cambria Erik
Ji Shaoxiong
Marttinen Pekka
Pan Shirui
Yu Philip S.
Publication venue
Publication date: 17/01/2021
Field of study

Human knowledge provides a formal understanding of the world. Knowledge graphs that represent structural relations between entities have become an increasingly popular research direction towards cognition and human-level intelligence. In this survey, we provide a comprehensive review of knowledge graph covering overall research topics about 1) knowledge graph representation learning, 2) knowledge acquisition and completion, 3) temporal knowledge graph, and 4) knowledge-aware applications, and summarize recent breakthroughs and perspective directions to facilitate future research. We propose a full-view categorization and new taxonomies on these topics. Knowledge graph embedding is organized from four aspects of representation space, scoring function, encoding models, and auxiliary information. For knowledge acquisition, especially knowledge graph completion, embedding methods, path inference, and logical rule reasoning, are reviewed. We further explore several emerging topics, including meta relational learning, commonsense reasoning, and temporal knowledge graphs. To facilitate future research on knowledge graphs, we also provide a curated collection of datasets and open-source libraries on different tasks. In the end, we have a thorough outlook on several promising research directions

arXiv.org e-Print Archive

OPUS - University of Technology Sydney