
    Looking Deeper into Deep Learning Model: Attribution-based Explanations of TextCNN

    Layer-wise Relevance Propagation (LRP) and saliency maps have recently been used to explain the predictions of Deep Learning models, specifically in the domain of text classification. Given different attribution-based explanations that highlight the words relevant to a predicted class label, experiments based on word-deletion perturbation are a common evaluation method. This word removal approach, however, disregards any linguistic dependencies that may exist between words or phrases in a sentence, which could semantically guide a classifier to a particular prediction. In this paper, we present a feature-based evaluation framework for comparing the two attribution methods on customer reviews (public data sets) and Customer Due Diligence (CDD) extracted reports (corporate data set). Instead of removing words based on the relevance score, we investigate perturbations based on removing embedded features from intermediate layers of Convolutional Neural Networks. Our experimental study is carried out on embedded-word, embedded-document, and embedded-ngrams explanations. Using the proposed framework, we provide a visualization tool to assist analysts in reasoning toward the model's final prediction.
    Comment: NIPS 2018 Workshop on Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy, Montréal, Canada
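
The word-deletion perturbation that the abstract describes as the common baseline can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: `toy_score` is a hypothetical bag-of-words stand-in for a trained classifier, the relevance values are invented, and `deletion_curve` is a hypothetical helper name. The sketch removes words in decreasing order of relevance and records the class score after each removal; a steep early drop suggests the attribution ranked the important words first.

```python
from typing import Callable, Dict, List, Sequence


def toy_score(tokens: Sequence[str], weights: Dict[str, float]) -> float:
    """Hypothetical stand-in for a classifier: sums per-word weights."""
    return sum(weights.get(t, 0.0) for t in tokens)


def deletion_curve(
    tokens: Sequence[str],
    relevance: Sequence[float],
    score_fn: Callable[[Sequence[str]], float],
) -> List[float]:
    """Delete words from most to least relevant, scoring after each deletion."""
    # Token indices sorted by relevance, highest first.
    order = sorted(range(len(tokens)), key=lambda i: relevance[i], reverse=True)
    removed = set()
    curve = [score_fn(tokens)]  # score of the unperturbed input
    for i in order:
        removed.add(i)
        kept = [t for j, t in enumerate(tokens) if j not in removed]
        curve.append(score_fn(kept))
    return curve


# Illustrative inputs (assumed, not from any data set in the paper).
weights = {"great": 2.0, "not": -1.5, "bad": -2.0}
tokens = ["the", "food", "was", "great"]
relevance = [0.0, 0.1, 0.0, 0.9]

curve = deletion_curve(tokens, relevance, lambda ts: toy_score(ts, weights))
print(curve)
```

Because each word is scored and removed independently, this baseline ignores dependencies between words, which is the limitation the abstract raises as motivation for perturbing embedded features instead.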

    Using deep learning to detect social media ‘trolls’

    Detecting criminal activity online is not a new concept, but how it can occur is changing. Technology and the influx of social media applications and platforms have a vital part to play in this changing landscape. As such, we observe an increasing problem with cyber abuse and ‘trolling’/toxicity on social media platforms where stories, posts, and memes are shared. In this paper we present our work on the application of deep learning techniques to the detection of ‘trolls’ and toxic content shared on social media platforms. We propose a machine learning solution for the detection of toxic images based on embedded text content. The project uses GloVe word embeddings for data augmentation to improve prediction capabilities. Our methodology details the implementation of Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models and their Bidirectional variants, comparing our approach to related works and highlighting evident improvements. Our experiments revealed that the best performing model, Bidirectional LSTM, achieved 0.92 testing accuracy and 0.88 inference accuracy, with F1-scores of 0.92 and 0.88, respectively.