Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning
Visual question answering requires high-order reasoning about an image, which
is a fundamental capability needed by machine systems to follow complex
directives. Recently, modular networks have been shown to be an effective
framework for performing visual reasoning tasks. While modular networks were
initially designed with a degree of model transparency, their performance on
complex visual reasoning benchmarks was lacking. Current state-of-the-art
approaches do not provide an effective mechanism for understanding the
reasoning process. In this paper, we close the performance gap between
interpretable models and state-of-the-art visual reasoning methods. We propose
a set of visual-reasoning primitives which, when composed, manifest as a model
capable of performing complex reasoning tasks in an explicitly interpretable
manner. The fidelity and interpretability of the primitives' outputs enable an
unparalleled ability to diagnose the strengths and weaknesses of the resulting
model. Critically, we show that these primitives are highly performant,
achieving state-of-the-art accuracy of 99.1% on the CLEVR dataset. We also show
that our model is able to effectively learn generalized representations when
provided a small amount of data containing novel object attributes. Using the
CoGenT generalization task, we show more than a 20 percentage point improvement
over the current state of the art.
Comment: CVPR 2018 pre-print
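The modular design lends itself to a compact sketch. Below is a minimal
illustration, assuming PyTorch, of how attention-passing primitives of this
kind might compose into a reasoning chain; the module architecture and the
hard-coded two-step program are illustrative assumptions, not the paper's
exact design.

import torch
import torch.nn as nn

class AttentionPrimitive(nn.Module):
    """A reasoning primitive that refines an attention map over image features."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, 1, kernel_size=1), nn.Sigmoid(),  # one value per location
        )

    def forward(self, feats, attn):
        # Gate the image features by the incoming attention, then predict a new map.
        return self.net(feats * attn)

feats = torch.randn(1, 64, 14, 14)   # CNN feature map (batch, channels, H, W)
attn = torch.ones(1, 1, 14, 14)      # start from uniform attention
# A two-step "program", e.g. filter_red -> filter_cube; each primitive's output
# attention map can be inspected directly, which is what makes the chain transparent.
program = [AttentionPrimitive(64), AttentionPrimitive(64)]
for module in program:
    attn = module(feats, attn)
print(attn.shape)  # torch.Size([1, 1, 14, 14])

Because every intermediate attention map is an explicit tensor, each step of
the chain can be visualized on its own, which is the diagnostic property the
abstract emphasizes.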
Machine Learning and the Future of Realism
The preceding three decades have seen the emergence, rise, and proliferation
of machine learning (ML). From half-recognised beginnings in perceptrons,
neural nets, and decision trees, algorithms that extract correlations (that is,
patterns) from a set of data points have broken free from their origin in
computational cognition to embrace all forms of problem solving, from voice
recognition to medical diagnosis to automated scientific research and
driverless cars, and it is now widely opined that the real industrial
revolution lies less in mobile phones and similar devices than in the
maturation and universal application of ML. Among the consequences just might
be the triumph of anti-realism over realism.
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
We propose the Neuro-Symbolic Concept Learner (NS-CL), a model that learns
visual concepts, words, and semantic parsing of sentences without explicit
supervision on any of them; instead, our model learns by simply looking at
images and reading paired questions and answers. Our model builds an
object-based scene representation and translates sentences into executable,
symbolic programs. To bridge the learning of two modules, we use a
neuro-symbolic reasoning module that executes these programs on the latent
scene representation. Analogous to human concept learning, the perception
module learns visual concepts based on the language description of the object
being referred to. Meanwhile, the learned visual concepts facilitate learning
new words and parsing new sentences. We use curriculum learning to guide the
search over the large compositional space of images and language. Extensive
experiments demonstrate the accuracy and efficiency of our model on learning
visual concepts, word representations, and semantic parsing of sentences.
Further, our method allows easy generalization to new object attributes,
compositions, language concepts, scenes and questions, and even new program
domains. It also empowers applications including visual question answering and
bidirectional image-text retrieval.
Comment: ICLR 2019 (Oral). Project page: http://nscl.csail.mit.edu
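To make the execution step concrete, here is a minimal sketch in plain Python
of running a symbolic program over an object-based scene representation with
soft concept scores; the program syntax, concept names, and scores are
illustrative assumptions rather than NS-CL's actual implementation.

from dataclasses import dataclass

@dataclass
class SceneObject:
    # In NS-CL these scores come from learned visual concept embeddings; here
    # they are hand-set probabilities that each concept applies to the object.
    concept_scores: dict

def filter_concept(objects, mask, concept):
    """Soft filter: scale each object's weight by its score for the concept."""
    return [m * obj.concept_scores.get(concept, 0.0)
            for m, obj in zip(mask, objects)]

def count(mask):
    """A soft count is just the sum of the per-object weights."""
    return sum(mask)

scene = [SceneObject({"red": 0.95, "cube": 0.90}),
         SceneObject({"red": 0.05, "cube": 0.80})]
mask = [1.0] * len(scene)                          # begin with every object
program = [("filter", "red"), ("filter", "cube")]  # "How many red cubes?"
for op, arg in program:
    if op == "filter":
        mask = filter_concept(scene, mask, arg)
print(count(mask))  # 0.895: the scene contains roughly one red cube

Keeping the filtering soft rather than discrete is what lets gradients flow
from the answer back into the concept scores, so perception and parsing can be
trained jointly from question-answer pairs alone.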
Opening the Black Box of Financial AI with CLEAR-Trade: A CLass-Enhanced Attentive Response Approach for Explaining and Visualizing Deep Learning-Driven Stock Market Prediction
Deep learning has been shown to outperform traditional machine learning
algorithms across a wide range of problem domains. However, current deep
learning algorithms have been criticized as uninterpretable "black boxes" that
cannot explain their decision-making processes. This is a major shortcoming
that prevents the widespread application of deep learning in heavily regulated
domains such as finance. As a result, these industries must rely on traditional
models such as decision trees that are far more interpretable but less
effective than deep learning on complex problems. In
this paper, we propose CLEAR-Trade, a novel financial AI visualization
framework for deep learning-driven stock market prediction that mitigates the
interpretability issue of deep learning methods. In particular, CLEAR-Trade
provides an effective way to visualize and explain decisions made by deep stock
market prediction models. We show the efficacy of CLEAR-Trade in enhancing the
interpretability of stock market prediction by conducting experiments based on
S&P 500 stock index prediction. The results demonstrate that CLEAR-Trade can
provide significant insight into the decision-making process of deep
learning-driven financial models, particularly in the context of regulatory
compliance, thus improving their potential for adoption in the financial
industry.
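As a rough illustration of the kind of explanation such a framework produces,
the sketch below, assuming PyTorch, attributes a toy predictor's up/down
decision back to individual days of a price window via gradient-times-input;
the model, window size, and attribution rule are illustrative assumptions and
differ from CLEAR-Trade's actual class-enhanced attentive response formulation.

import torch
import torch.nn as nn

# Toy classifier over a 30-day normalized price window: market up vs. down.
model = nn.Sequential(
    nn.Linear(30, 16), nn.ReLU(),
    nn.Linear(16, 2),
)

prices = torch.randn(1, 30, requires_grad=True)  # one input window
logits = model(prices)
pred = logits.argmax(dim=1).item()

# Attribute the predicted class score to each input day (gradient x input).
logits[0, pred].backward()
attribution = (prices.grad * prices).squeeze()
top_days = attribution.abs().topk(5).indices
print(f"predicted class {pred}; most influential days: {top_days.tolist()}")

An explanation of this shape, which input days most influenced the predicted
class, is the sort of per-decision evidence a regulator or risk officer could
audit.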