A Practitioners' Guide to Transfer Learning for Text Classification using Convolutional Neural Networks

Authors: Mathur Gaurav; Nair Shivashankar B.; Semwal Tushar; Yenigalla Promod
Publication date: 19/01/2018

Transfer Learning (TL) plays a crucial role when a given dataset has insufficient labeled examples to train an accurate model. In such scenarios, the knowledge accumulated within a model pre-trained on a source dataset can be transferred to a target dataset, resulting in the improvement of the target model. Though TL has been successful in image-based applications, its impact and practical use in Natural Language Processing (NLP) applications are still a subject of research. Due to their hierarchical architecture, Deep Neural Networks (DNNs) provide flexibility and customization in adjusting their parameters and depth of layers, making them well suited to TL. In this paper, we report the results and conclusions obtained from extensive empirical experiments using a Convolutional Neural Network (CNN) and try to uncover rules of thumb to ensure a successful positive transfer. In addition, we highlight the flawed approaches that could lead to a negative transfer. We explore the transferability of various layers and describe the effect of varying hyper-parameters on the transfer performance. We also present a comparison of accuracy and model size against state-of-the-art methods. Finally, we derive inferences from the empirical results and provide best practices to achieve a successful positive transfer.

Comment: 9 pages, 2 figures, accepted in SDM 201

arXiv.org e-Print Archive

Crossref

Methods for Interpreting and Understanding Deep Neural Networks

Authors: Montavon Grégoire; Müller Klaus-Robert; Samek Wojciech
Publication venue: Elsevier BV
Publication date: 24/06/2017

This paper provides an entry point to the problem of interpreting a deep neural network model and explaining its predictions. It is based on a tutorial given at ICASSP 2017. It introduces some recently proposed interpretation techniques, along with theory, tricks and recommendations, to make the most efficient use of these techniques on real data. It also discusses a number of practical applications.

Comment: 14 pages, 10 figures

arXiv.org e-Print Archive

Fraunhofer-ePrints

MPG.PuRe

Cross-Domain Labeled LDA for Cross-Domain Text Classification

Authors: Jing Baoyu; Lu Chenwei; Niu Cheng; Wang Deqing; Zhuang Fuzhen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 16/09/2018

Cross-domain text classification aims at building a classifier for a target domain that leverages data from both the source and target domains. One promising idea is to minimize the feature distribution differences between the two domains. Most existing studies explicitly minimize such differences by an exact alignment mechanism (aligning features by one-to-one feature alignment, projection matrix, etc.). Such exact alignment, however, restricts models' learning ability and further impairs their performance on classification tasks when the semantic distributions of the domains are very different. To address this problem, we propose a novel group alignment, which aligns the semantics at the group level. In addition, to help the model learn better semantic groups and the semantics within these groups, we also propose partial supervision for the model's learning in the source domain. To this end, we embed the group alignment and the partial supervision into a cross-domain topic model, and propose Cross-Domain Labeled LDA (CDL-LDA). On the standard 20 Newsgroups and Reuters datasets, extensive quantitative (classification, perplexity, etc.) and qualitative (topic detection) experiments are conducted to show the effectiveness of the proposed group alignment and partial supervision.

Comment: ICDM 201

arXiv.org e-Print Archive

Crossref