18,572 research outputs found

    Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning

    Visual question answering requires high-order reasoning about an image, which is a fundamental capability needed by machine systems to follow complex directives. Recently, modular networks have been shown to be an effective framework for performing visual reasoning tasks. While modular networks were initially designed with a degree of model transparency, their performance on complex visual reasoning benchmarks was lacking. Current state-of-the-art approaches do not provide an effective mechanism for understanding the reasoning process. In this paper, we close the performance gap between interpretable models and state-of-the-art visual reasoning methods. We propose a set of visual-reasoning primitives which, when composed, manifest as a model capable of performing complex reasoning tasks in an explicitly interpretable manner. The fidelity and interpretability of the primitives' outputs enable an unparalleled ability to diagnose the strengths and weaknesses of the resulting model. Critically, we show that these primitives are highly performant, achieving state-of-the-art accuracy of 99.1% on the CLEVR dataset. We also show that our model is able to effectively learn generalized representations when provided a small amount of data containing novel object attributes. Using the CoGenT generalization task, we show more than a 20 percentage point improvement over the current state of the art. Comment: CVPR 2018 pre-print
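
    To make the idea concrete, here is a minimal sketch of what composable, attention-based visual-reasoning primitives can look like. It assumes a design in which each primitive reads and writes spatial attention masks over CNN image features; the module names, shapes, and architecture details below are illustrative assumptions, not the authors' exact implementation.

    import torch
    import torch.nn as nn

    class AttendPrimitive(nn.Module):
        """Produce a one-channel spatial attention mask over image features,
        conditioned on a per-concept embedding (e.g. 'red' or 'cube').
        Illustrative module, not the paper's exact architecture."""
        def __init__(self, feat_dim, concept_dim):
            super().__init__()
            self.project = nn.Linear(concept_dim, feat_dim)   # concept -> feature-space weights
            self.score = nn.Conv2d(feat_dim, 1, kernel_size=1)

        def forward(self, feats, concept):        # feats: (B, C, H, W), concept: (B, concept_dim)
            w = self.project(concept)[:, :, None, None]       # (B, C, 1, 1)
            return torch.sigmoid(self.score(feats * w))       # (B, 1, H, W) mask in [0, 1]

    class AndPrimitive(nn.Module):
        """Compose two attention masks as a logical AND (elementwise minimum)."""
        def forward(self, mask_a, mask_b):
            return torch.minimum(mask_a, mask_b)

    # Composing primitives mirrors the question's reasoning program, e.g.
    # 'red AND cube'; every intermediate mask can be inspected directly,
    # which is what makes this style of model transparent.
    attend = AttendPrimitive(feat_dim=128, concept_dim=64)
    feats = torch.randn(2, 128, 14, 14)
    red, cube = torch.randn(2, 64), torch.randn(2, 64)
    red_and_cube_mask = AndPrimitive()(attend(feats, red), attend(feats, cube))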

    Machine Learning and the Future of Realism

    The preceding three decades have seen the emergence, rise, and proliferation of machine learning (ML). From half-recognised beginnings in perceptrons, neural nets, and decision trees, algorithms that extract correlations (that is, patterns) from sets of data points have broken free from their origin in computational cognition to embrace all forms of problem solving, from voice recognition to medical diagnosis to automated scientific research and driverless cars. It is now widely opined that the real industrial revolution lies less in the mobile phone and the like than in the maturation and universal application of ML. Among the consequences just might be the triumph of anti-realism over realism.

    The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision

    We propose the Neuro-Symbolic Concept Learner (NS-CL), a model that learns visual concepts, words, and semantic parsing of sentences without explicit supervision on any of them; instead, our model learns by simply looking at images and reading paired questions and answers. Our model builds an object-based scene representation and translates sentences into executable, symbolic programs. To bridge the learning of the two modules, we use a neuro-symbolic reasoning module that executes these programs on the latent scene representation. Analogous to human concept learning, the perception module learns visual concepts based on the language description of the object being referred to. Meanwhile, the learned visual concepts facilitate learning new words and parsing new sentences. We use curriculum learning to guide the search over the large compositional space of images and language. Extensive experiments demonstrate the accuracy and efficiency of our model at learning visual concepts, word representations, and semantic parsing of sentences. Further, our method allows easy generalization to new object attributes, compositions, language concepts, scenes and questions, and even new program domains. It also empowers applications including visual question answering and bidirectional image-text retrieval. Comment: ICLR 2019 (Oral). Project page: http://nscl.csail.mit.edu
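
    The execution model can be sketched briefly. Below is a minimal, assumed illustration of running a symbolic program over a latent object-based scene representation: each operation scores detected objects against learned concept embeddings, so execution stays differentiable end to end. The function names, the cosine-similarity scoring rule, and the hand-written program are assumptions for illustration, not NS-CL's exact semantics.

    import torch
    import torch.nn.functional as F

    def filter_concept(obj_feats, obj_probs, concept_emb):
        """Soft-select objects matching a concept: scale each object's current
        selection probability by its similarity to the concept embedding."""
        sims = torch.sigmoid(10 * F.cosine_similarity(obj_feats, concept_emb.unsqueeze(0), dim=-1))
        return obj_probs * sims

    def count(obj_probs):
        """Differentiable count: the sum of soft selection probabilities."""
        return obj_probs.sum()

    # Execute the program Count(Filter(red, scene)) on five detected objects.
    obj_feats = torch.randn(5, 64)   # latent object embeddings from the perception module
    red_emb = torch.randn(64)        # learned embedding for the concept 'red'
    selected = filter_concept(obj_feats, torch.ones(5), red_emb)
    answer = count(selected)         # gradients flow back to both modules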

    Opening the Black Box of Financial AI with CLEAR-Trade: A CLass-Enhanced Attentive Response Approach for Explaining and Visualizing Deep Learning-Driven Stock Market Prediction

    Deep learning has been shown to outperform traditional machine learning algorithms across a wide range of problem domains. However, current deep learning algorithms have been criticized as uninterpretable "black boxes" that cannot explain their decision-making processes. This is a major shortcoming that prevents the widespread application of deep learning to domains with regulatory processes, such as finance. As a result, such industries must rely on traditional models like decision trees that are much more interpretable but less effective than deep learning for complex problems. In this paper, we propose CLEAR-Trade, a novel financial AI visualization framework for deep learning-driven stock market prediction that mitigates the interpretability issue of deep learning methods. In particular, CLEAR-Trade provides an effective way to visualize and explain decisions made by deep stock market prediction models. We show the efficacy of CLEAR-Trade in enhancing the interpretability of stock market prediction by conducting experiments on S&P 500 stock index prediction. The results demonstrate that CLEAR-Trade can provide significant insight into the decision-making process of deep learning-driven financial models, particularly with respect to regulatory processes, thus improving their potential uptake in the financial industry.
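
    As a rough illustration of this kind of visualization, the sketch below computes a class-specific response map for a small 1-D convolutional network over a price sequence, in the spirit of class activation mapping. The network, layer names, and the exact projection rule are assumptions and may well differ from the paper's formulation.

    import torch
    import torch.nn as nn

    class TinyTrendNet(nn.Module):
        """Toy up/down classifier over a univariate price sequence
        (illustrative stand-in for a deep stock prediction model)."""
        def __init__(self, n_classes=2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
                nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU())
            self.classifier = nn.Linear(32, n_classes)  # applied after global average pooling

        def forward(self, x):                         # x: (B, 1, T)
            f = self.features(x)                      # (B, 32, T)
            return self.classifier(f.mean(dim=2)), f

    def class_response_map(model, x, cls):
        """Project the chosen class's weights back onto the temporal feature
        maps to score how much each time step supports that class."""
        _, f = model(x)
        w = model.classifier.weight[cls]              # (32,)
        return torch.einsum('c,bct->bt', w, f)        # (B, T) per-time-step evidence

    model = TinyTrendNet()
    prices = torch.randn(1, 1, 60)                    # 60 trading days, standardized
    logits, _ = model(prices)
    evidence = class_response_map(model, prices, logits.argmax(dim=1).item())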