Deep Non-Monotonic Reasoning for Visual Abstract Reasoning Tasks
While achieving unmatched performance on many well-defined tasks, deep
learning models have also been applied to visual abstract reasoning tasks,
which are less well-defined and have long been used to measure
human intelligence. However, current deep models struggle to match the human
ability to solve such tasks from minimal data with maximal generalization. One
limitation is that current deep learning models work in a monotonic way, i.e.,
treating different parts of the input in essentially fixed orderings, whereas
people repeatedly observe and reason about the different parts of the visual
stimuli until the reasoning process converges to a consistent conclusion, i.e.,
non-monotonic reasoning. This paper proposes a non-monotonic computational
approach to solve visual abstract reasoning tasks. In particular, we
implemented a deep learning model using this approach and tested it on the
RAVEN dataset -- a dataset inspired by the Raven's Progressive Matrices test.
Results show that, under strict experimental settings representing a difficult
variant of the RAVEN dataset problem, the proposed approach is more effective
than existing monotonic deep learning models.
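The iterate-until-consistent idea can be sketched as a fixed-point loop. This is a toy illustration, not the paper's model: `nonmonotonic_reason`, the `update` rule, and the two-panel example are all invented here to show the contrast with a single fixed-order pass.

```python
import numpy as np

def nonmonotonic_reason(panels, update, max_iters=50, tol=1e-4):
    """Repeatedly revisit every panel, revising a belief state until it
    stops changing (a consistent conclusion), rather than consuming the
    input once in a fixed order as a monotonic model would."""
    state = np.zeros_like(panels[0], dtype=float)
    for _ in range(max_iters):
        prev = state.copy()
        for p in panels:                 # re-observe each part every round
            state = update(state, p)
        if np.linalg.norm(state - prev) < tol:
            break                        # reasoning has converged
    return state

# toy update rule: move the belief halfway toward each observed panel
panels = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
state = nonmonotonic_reason(panels, lambda s, p: s + 0.5 * (p - s))
```

With this contractive update the loop settles on the unique fixed point (here `[1/3, 2/3]`) regardless of the starting state, which is the sense in which the process "converges to a consistent conclusion".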
Abstract Diagrammatic Reasoning with Multiplex Graph Networks
Abstract reasoning, particularly in the visual domain, is a complex human ability, but it remains a challenging problem for artificial neural learning systems. In this work we propose MXGNet, a multi-layer graph neural network for multi-panel diagrammatic reasoning tasks. MXGNet combines three powerful concepts, namely, object-level representation, graph neural networks and multiplex graphs, for solving visual reasoning tasks. MXGNet first extracts object-level representations for each element in all panels of the diagrams, and then forms a multi-layer multiplex graph capturing multiple relations between objects across different diagram panels. MXGNet summarises the multiple graphs extracted from the diagrams of the task, and uses this summarisation to pick the most probable answer from the given candidates. We have tested MXGNet on two types of diagrammatic reasoning tasks, namely Diagram Syllogisms and Raven Progressive Matrices (RPM). For the Euler Diagram Syllogism task, MXGNet achieves state-of-the-art accuracy of 99.8%. For PGM and RAVEN, two comprehensive datasets for RPM reasoning, MXGNet outperforms the state-of-the-art models by a considerable margin.
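The extract-objects / build-multiplex-graph / summarise / score pipeline can be caricatured in a few lines. This is a minimal sketch, not the MXGNet architecture: `score_candidates`, the hand-written `edge_fns` (one function per relation layer of the multiplex graph), and the toy vectors are all assumptions made here for illustration.

```python
import numpy as np

def score_candidates(context_objs, candidates, edge_fns):
    """Multiplex-graph sketch: context objects are nodes; each function in
    edge_fns defines one parallel edge type (one layer of the multiplex
    graph); the summarised graph embedding is scored against candidates."""
    layers = []
    for fn in edge_fns:                        # one relation layer per edge type
        msgs = [fn(a, b) for a in context_objs for b in context_objs]
        layers.append(np.mean(msgs, axis=0))   # aggregate messages in this layer
    summary = np.sum(layers, axis=0)           # summarise across layers
    scores = [float(summary @ c) for c in candidates]
    return int(np.argmax(scores))              # index of most probable answer

ctx = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
edge_fns = [lambda a, b: a + b, lambda a, b: a * b]   # two toy relation types
cands = [np.array([1.0, 1.0]), np.array([-1.0, 0.0])]
best = score_candidates(ctx, cands, edge_fns)
```

The point of the multiplex structure is that different relation types are aggregated in separate layers before being summarised, rather than collapsed into a single edge representation.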
Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning
In representation learning, a disentangled representation is highly desirable
as it encodes generative factors of data in a separable and compact pattern.
Researchers have advocated leveraging disentangled representations to complete
downstream tasks with encouraging empirical evidence. This paper further
investigates the necessity of disentangled representation in downstream
applications. Specifically, we show that dimension-wise disentangled
representations are unnecessary on a fundamental downstream task, abstract
visual reasoning. We provide extensive empirical evidence against the necessity
of disentanglement, covering multiple datasets, representation learning
methods, and downstream network architectures. Furthermore, our findings
suggest that the informativeness of representations is a better indicator of
downstream performance than disentanglement. Finally, the positive correlation
between informativeness and disentanglement explains the claimed usefulness of
disentangled representations in previous works. The source code is available at
https://github.com/Richard-coder-Nai/disentanglement-lib-necessity.git.
Comment: Accepted to AAAI-202
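The informativeness-over-disentanglement point can be illustrated with a toy probe. This sketch is not the paper's experimental protocol: the synthetic factors, the rotation used to entangle them, and the least-squares linear probe are all assumptions made here. An invertible rotation destroys dimension-wise disentanglement but preserves the information content, so a probe performs equally well on both codes.

```python
import numpy as np

rng = np.random.default_rng(0)

# two ground-truth generative factors; the label depends on both
factors = rng.normal(size=(500, 2))
labels = (factors[:, 0] + factors[:, 1] > 0).astype(int)

# disentangled code = factors themselves; entangled code = a 45° rotation,
# which mixes the factors across dimensions but loses no information
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
entangled = factors @ R.T

def probe_accuracy(X, y):
    """Least-squares linear probe: a crude proxy for informativeness."""
    Xb = np.hstack([X, np.ones((len(X), 1))])          # add bias column
    w, *_ = np.linalg.lstsq(Xb, 2 * y - 1, rcond=None)  # regress ±1 targets
    return float(((Xb @ w > 0).astype(int) == y).mean())

acc_dis = probe_accuracy(factors, labels)
acc_ent = probe_accuracy(entangled, labels)
```

Since the rotation is an invertible linear map, the linear probe reaches the same fit on both representations: equal downstream performance despite very different degrees of disentanglement.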
Towards Efficient Lifelong Machine Learning in Deep Neural Networks
Humans continually learn and adapt to new knowledge and environments throughout their lifetimes. Rarely does learning new information cause humans to catastrophically forget previous knowledge. While deep neural networks (DNNs) now rival human performance on several supervised machine perception tasks, when updated on changing data distributions, they catastrophically forget previous knowledge. Enabling DNNs to learn new information over time opens the door for new applications such as self-driving cars that adapt to seasonal changes or smartphones that adapt to changing user preferences. In this dissertation, we propose new methods and experimental paradigms for efficiently training continual DNNs without forgetting. We then apply these methods to several visual and multi-modal perception tasks including image classification, visual question answering, analogical reasoning, and attribute and relationship prediction in visual scenes.
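Catastrophic forgetting, and one standard remedy, can be shown with a toy model. This is a perceptron stand-in, not the dissertation's methods: the two interfering tasks and the rehearsal (replay) mitigation used below are illustrative assumptions.

```python
import numpy as np

def train(w, X, y, lr=0.1, epochs=50):
    """Perceptron updates: a tiny stand-in for training a DNN."""
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if np.sign(w @ xi) != yi:   # misclassified: nudge the weights
                w = w + lr * yi * xi
    return w

def accuracy(w, X, y):
    return float((np.sign(X @ w) == y).mean())

# two tasks whose solutions interfere with each other
XA = np.array([[1.0, 0.0], [-1.0, 0.0]]); yA = np.array([1, -1])
XB = np.array([[1.0, 1.0], [-1.0, -1.0]]); yB = np.array([-1, 1])

w_a = train(np.zeros(2), XA, yA)       # learn task A first
w_seq = train(w_a.copy(), XB, yB)      # naive sequential update on task B
w_replay = train(w_a.copy(),           # rehearsal: replay task A data
                 np.vstack([XA, XB]), np.hstack([yA, yB]))
```

After the naive sequential update, the model solves task B but its task A accuracy collapses (catastrophic forgetting); mixing replayed task A examples into the task B updates retains both, which is the intuition behind rehearsal-based continual learning.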