1,331 research outputs found

    Learning and Interpreting Multi-Multi-Instance Learning Networks

    Get PDF
    We introduce an extension of the multi-instance learning problem where examples are organized as nested bags of instances (e.g., a document could be represented as a bag of sentences, which in turn are bags of words). This framework can be useful in various scenarios, such as text and image classification, but also supervised learning over graphs. As a further advantage, multi-multi instance learning enables a particular way of interpreting predictions and the decision function. Our approach is based on a special neural network layer, called bag-layer, whose units aggregate bags of inputs of arbitrary size. We prove theoretically that the associated class of functions contains all Boolean functions over sets of sets of instances and we provide empirical evidence that functions of this kind can be actually learned on semi-synthetic datasets. We finally present experiments on text classification, on citation graphs, and social graph data, which show that our model obtains competitive results with respect to accuracy when compared to other approaches such as convolutional networks on graphs, while at the same time it supports a general approach to interpret the learnt model, as well as explain individual predictions.Comment: JML

    Audio-Visual Sentiment Analysis for Learning Emotional Arcs in Movies

    Full text link
    Stories can have tremendous power -- not only useful for entertainment, they can activate our interests and mobilize our actions. The degree to which a story resonates with its audience may be in part reflected in the emotional journey it takes the audience upon. In this paper, we use machine learning methods to construct emotional arcs in movies, calculate families of arcs, and demonstrate the ability for certain arcs to predict audience engagement. The system is applied to Hollywood films and high quality shorts found on the web. We begin by using deep convolutional neural networks for audio and visual sentiment analysis. These models are trained on both new and existing large-scale datasets, after which they can be used to compute separate audio and visual emotional arcs. We then crowdsource annotations for 30-second video clips extracted from highs and lows in the arcs in order to assess the micro-level precision of the system, with precision measured in terms of agreement in polarity between the system's predictions and annotators' ratings. These annotations are also used to combine the audio and visual predictions. Next, we look at macro-level characterizations of movies by investigating whether there exist `universal shapes' of emotional arcs. In particular, we develop a clustering approach to discover distinct classes of emotional arcs. Finally, we show on a sample corpus of short web videos that certain emotional arcs are statistically significant predictors of the number of comments a video receives. These results suggest that the emotional arcs learned by our approach successfully represent macroscopic aspects of a video story that drive audience engagement. Such machine understanding could be used to predict audience reactions to video stories, ultimately improving our ability as storytellers to communicate with each other.Comment: Data Mining (ICDM), 2017 IEEE 17th International Conference o

    Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network

    Full text link
    Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that is able to represent the meaning of documents by embedding them in a low dimensional vector space, while preserving distinctions of word and sentence order crucial for capturing nuanced semantics. Our model is based on an extended Dynamic Convolution Neural Network, which learns convolution filters at both the sentence and document level, hierarchically learning to capture and compose low level lexical features into high level semantic concepts. We demonstrate the effectiveness of this model on a range of document modelling tasks, achieving strong results with no feature engineering and with a more compact model. Inspired by recent advances in visualising deep convolution networks for computer vision, we present a novel visualisation technique for our document networks which not only provides insight into their learning process, but also can be interpreted to produce a compelling automatic summarisation system for texts

    Explaining Recurrent Neural Network Predictions in Sentiment Analysis

    Full text link
    Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown to deliver insightful explanations in the form of input space relevances for understanding feed-forward neural network classification decisions. In the present work, we extend the usage of LRP to recurrent neural networks. We propose a specific propagation rule applicable to multiplicative connections as they arise in recurrent network architectures such as LSTMs and GRUs. We apply our technique to a word-based bi-directional LSTM model on a five-class sentiment prediction task, and evaluate the resulting LRP relevances both qualitatively and quantitatively, obtaining better results than a gradient-based related method which was used in previous work.Comment: 9 pages, 4 figures, accepted for EMNLP'17 Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA
    • …
    corecore