Auto-Sizing Neural Networks: With Applications to n-gram Language Models

Authors: Chiang, David; Murray, Kenton
Publication date: 01/01/2015

Neural networks have been shown to improve performance across a range of natural-language tasks. However, designing and training them can be complicated. Frequently, researchers resort to repeated experimentation to pick optimal settings. In this paper, we address the issue of choosing the correct number of units in hidden layers. We introduce a method for automatically adjusting network size by pruning out hidden units through $\ell_{\infty,1}$ and $\ell_{2,1}$ regularization. We apply this method to language modeling and demonstrate its ability to correctly choose the number of hidden units while maintaining perplexity. We also include these models in a machine translation decoder and show that these smaller neural models maintain the significant improvements of their unpruned versions.

Comment: EMNLP 201
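To make the two regularizers in the abstract concrete, the sketch below computes both mixed norms on a small weight matrix and shows how a group soft-thresholding step can zero out an entire row. It rests on assumptions not stated in the abstract: each row is taken to hold the weights of one hidden unit, and the proximal step is a standard technique for the $\ell_{2,1}$ penalty rather than a confirmed detail of the paper's optimizer. All function names are hypothetical.

```python
import math


def l21_norm(weights):
    """Sum, over rows, of each row's Euclidean (l2) norm."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in weights)


def linf1_norm(weights):
    """Sum, over rows, of each row's largest absolute value (l-infinity norm)."""
    return sum(max(abs(w) for w in row) for row in weights)


def prox_l21(weights, threshold):
    """Group soft-thresholding (assumed update, not taken from the paper).

    Each row is shrunk toward zero by `threshold` in l2 norm; a row whose
    norm is at most `threshold` becomes exactly zero.
    """
    shrunk = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        shrunk.append([scale * w for w in row])
    return shrunk


def active_units(weights):
    """Indices of rows (assumed: hidden units) that still have a nonzero weight."""
    return [i for i, row in enumerate(weights) if any(w != 0.0 for w in row)]


# Illustrative values only: one strong unit, one weak unit, one dead unit.
example = [[3.0, 4.0], [0.1, 0.0], [0.0, 0.0]]
pruned = prox_l21(example, threshold=0.5)
surviving = active_units(pruned)
```

In this toy case only the first row survives the shrinkage step, which is the sense in which a row-wise penalty can select the number of hidden units instead of merely shrinking individual weights.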

arXiv.org e-Print Archive

CiteSeerX

Crossref

Compact Personalized Models for Neural Machine Translation

Authors: DeNero, John; Simianer, Patrick; Wuebker, Joern
Publication date: 01/01/2018

We propose and compare methods for gradient-based domain adaptation of self-attentive neural machine translation models. We demonstrate that a large proportion of model parameters can be frozen during adaptation with minimal or no reduction in translation quality by encouraging structured sparsity in the set of offset tensors during learning via group lasso regularization. We evaluate this technique for both batch and incremental adaptation across multiple data sets and language pairs. Our system architecture - combining a state-of-the-art self-attentive model with compact domain adaptation - provides high quality personalized machine translation that is both space and time efficient.

Comment: Published at the 2018 Conference on Empirical Methods in Natural Language Processin
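The space saving described in the abstract can be illustrated with a small sketch: if a personalized model is stored as a shared base plus per-tensor offsets, then any tensor whose offset is entirely zero does not need to be stored at all. The code below is an illustration under assumptions, not the authors' system: tensors are flattened to plain lists, each tensor is treated as one group, and the names `group_lasso_penalty`, `compact`, and `personalize` are hypothetical.

```python
import math


def group_lasso_penalty(offsets, strength):
    """Penalty term: strength times the sum of each offset tensor's l2 norm."""
    return strength * sum(
        math.sqrt(sum(v * v for v in tensor)) for tensor in offsets.values()
    )


def compact(offsets):
    """Keep only offset tensors with at least one nonzero entry."""
    return {
        name: tensor
        for name, tensor in offsets.items()
        if any(v != 0.0 for v in tensor)
    }


def personalize(base, stored_offsets):
    """Rebuild adapted parameters as base + offset; missing offsets mean 'frozen'."""
    adapted = {}
    for name, values in base.items():
        delta = stored_offsets.get(name)
        if delta is None:
            adapted[name] = list(values)
        else:
            adapted[name] = [b + d for b, d in zip(values, delta)]
    return adapted


# Illustrative values only: the "embed" offset was driven to zero during
# adaptation, so only the "attn" offset has to be stored per user or domain.
base = {"embed": [1.0, 2.0], "attn": [0.5, -0.5]}
offsets = {"embed": [0.0, 0.0], "attn": [3.0, 4.0]}

stored = compact(offsets)
adapted = personalize(base, stored)
penalty = group_lasso_penalty(offsets, strength=0.1)
```

Because the penalty acts on whole tensors rather than single weights, the sparsity it encourages lines up with what can actually be dropped from storage.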

arXiv.org e-Print Archive

Crossref

On stopwords, filtering and data sparsity for sentiment analysis of Twitter

Authors: Alani, Harith; Fernández, Miriam; He, Yulan; Saif, Hassan
Publication date: 01/01/2014

Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular procedure to reduce the noise of textual data is to remove stopwords by using pre-compiled stopword lists or more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier’s feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method for maintaining a high classification performance while reducing the data sparsity and substantially shrinking the feature space.
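The dynamic stopword method favoured in the abstract, dropping terms that occur only once in the corpus, can be sketched in a few lines. This is a minimal illustration under assumptions: tokenization is lowercase whitespace splitting, which is far simpler than what a real Twitter pipeline would use, and the function names and sample tweets are made up.

```python
from collections import Counter


def tokenize(tweet):
    """Assumed tokenizer: lowercase, split on whitespace."""
    return tweet.lower().split()


def singleton_terms(tweets):
    """Terms whose total corpus frequency is exactly one."""
    counts = Counter(token for tweet in tweets for token in tokenize(tweet))
    return {term for term, count in counts.items() if count == 1}


def remove_terms(tweets, stoplist):
    """Tokenize each tweet and drop every token found in the stoplist."""
    return [
        [token for token in tokenize(tweet) if token not in stoplist]
        for tweet in tweets
    ]


def vocabulary(token_lists):
    """Set of distinct terms, i.e. the feature space of a bag-of-words model."""
    return {token for tokens in token_lists for token in tokens}


# Made-up sample corpus for illustration.
tweets = ["love this phone", "love the new phone", "hate this"]

stoplist = singleton_terms(tweets)
filtered = remove_terms(tweets, stoplist)
before = len(vocabulary(remove_terms(tweets, set())))
after = len(vocabulary(filtered))
```

On this toy corpus the vocabulary shrinks from six terms to three. It also shows a risk: "hate" is removed because it happens to occur once, which is an artifact of the tiny sample rather than a claim about the paper's datasets.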

Open Research Online (The Open University)

Aston Publications Explorer