Search CORE

2,906 research outputs found

Auto-Sizing Neural Networks: With Applications to n-gram Language Models

Author: Chiang David
Murray Kenton
Publication venue
Publication date: 01/01/2015
Field of study

Neural networks have been shown to improve performance across a range of natural-language tasks. However, designing and training them can be complicated. Frequently, researchers resort to repeated experimentation to pick optimal settings. In this paper, we address the issue of choosing the correct number of units in hidden layers. We introduce a method for automatically adjusting network size by pruning out hidden units through

\ell_{\infty,1}

and

\ell_{2,1}

regularization. We apply this method to language modeling and demonstrate its ability to correctly choose the number of hidden units while maintaining perplexity. We also include these models in a machine translation decoder and show that these smaller neural models maintain the significant improvements of their unpruned versions.Comment: EMNLP 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Improving Lexical Choice in Neural Machine Translation

Author: Chiang David
Nguyen Toan Q.
Publication venue
Publication date: 01/01/2018
Field of study

We explore two solutions to the problem of mistranslating rare words in neural machine translation. First, we argue that the standard output layer, which computes the inner product of a vector representing the context with all possible output word embeddings, rewards frequent words disproportionately, and we propose to fix the norms of both vectors to a constant value. Second, we integrate a simple lexical module which is jointly trained with the rest of the model. We evaluate our approaches on eight language pairs with data sizes ranging from 100k to 8M words, and achieve improvements of up to +4.3 BLEU, surpassing phrase-based translation in nearly all settings.Comment: Accepted at NAACL HLT 201

arXiv.org e-Print Archive

Crossref

Heterogeneous Congestion Control: Efficiency, Fairness and Design

Author: Chiang Mung
Low Steven H.
Tang Ao
Wei David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006
Field of study

When heterogeneous congestion control protocols that react to different pricing signals (e.g. packet loss, queueing delay, ECN marking etc.) share the same network, the current theory based on utility maximization fails to predict the network behavior. Unlike in a homogeneous network, the bandwidth allocation now depends on router parameters and flow arrival patterns. It can be non-unique, inefficient and unfair. This paper has two objectives. First, we demonstrate the intricate behaviors of a heterogeneous network through simulations and present a rigorous framework to help understand its equilibrium efficiency and fairness properties. By identifying an optimization problem associated with every equilibrium, we show that every equilibrium is Pareto efficient and provide an upper bound on efficiency loss due to pricing heterogeneity. On fairness, we show that intra-protocol fairness is still decided by a utility maximization problem while inter-protocol fairness is the part over which we donÂ¿t have control. However it is shown that we can achieve any desirable inter-protocol fairness by properly choosing protocol parameters. Second, we propose a simple slow timescale source-based algorithm to decouple bandwidth allocation from router parameters and flow arrival patterns and prove its feasibility. The scheme needs only local information

CiteSeerX

Crossref

Caltech Authors