Auto-Sizing Neural Networks: With Applications to n-gram Language Models
Neural networks have been shown to improve performance across a range of
natural-language tasks. However, designing and training them can be
complicated. Frequently, researchers resort to repeated experimentation to pick
optimal settings. In this paper, we address the issue of choosing the correct
number of units in hidden layers. We introduce a method for automatically
adjusting network size by pruning out hidden units through
ℓ∞,1 and ℓ2,1 regularization. We apply this method to language modeling and
demonstrate its ability to correctly choose the number of hidden units while
maintaining perplexity. We also include these models in a machine translation
decoder and show that these smaller neural models maintain the significant
improvements of their unpruned versions. Comment: EMNLP 201
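To make the pruning idea concrete, here is a minimal sketch of how a group-sparsity penalty can switch off whole hidden units. It assumes an ℓ2,1-style penalty applied to each unit's row of weights and a proximal (shrinkage) step after a gradient update. The function names, the threshold, and the toy weights are hypothetical and are not taken from the paper.

```python
import math


def prox_l21(weights, threshold):
    """Shrink each row (one hidden unit) toward zero by `threshold` in l2 norm.

    Rows whose norm is at most `threshold` become exactly zero, which is what
    lets the corresponding hidden unit be pruned. Assumed illustration only.
    """
    result = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        if norm <= threshold:
            result.append([0.0] * len(row))
        else:
            scale = 1.0 - threshold / norm
            result.append([w * scale for w in row])
    return result


def surviving_units(weights):
    """Return the indices of rows that still have a nonzero weight."""
    return [i for i, row in enumerate(weights) if any(w != 0.0 for w in row)]


# Toy weight matrix: three hidden units with two incoming weights each.
W = [[3.0, 4.0], [0.1, 0.2], [0.0, 1.0]]
pruned = prox_l21(W, 0.5)
kept = surviving_units(pruned)
```

In this toy case the second unit has a small norm and is zeroed out entirely, while the other two are only shrunk, so the layer could be rebuilt with the units listed in `kept`.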
Improving Lexical Choice in Neural Machine Translation
We explore two solutions to the problem of mistranslating rare words in
neural machine translation. First, we argue that the standard output layer,
which computes the inner product of a vector representing the context with all
possible output word embeddings, rewards frequent words disproportionately, and
we propose to fix the norms of both vectors to a constant value. Second, we
integrate a simple lexical module which is jointly trained with the rest of the
model. We evaluate our approaches on eight language pairs with data sizes
ranging from 100k to 8M words, and achieve improvements of up to +4.3 BLEU,
surpassing phrase-based translation in nearly all settings. Comment: Accepted at NAACL HLT 201
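The first proposal can be illustrated with a small sketch. An inner-product score grows with the norm of the output embedding, so a large-norm word can outscore a better-aligned one. Rescaling both vectors to a constant norm leaves only the angle between them. The constant `r`, the helper names, and the toy vectors below are assumptions for illustration, not the authors' implementation. All vectors are assumed to be nonzero.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rescale(v, r):
    """Rescale a nonzero vector so that its l2 norm equals r."""
    norm = math.sqrt(dot(v, v))
    return [r * x / norm for x in v]


def logits(context, embeddings, r=None):
    """Score each output word; if r is given, fix both norms to r first."""
    if r is not None:
        context = rescale(context, r)
        embeddings = [rescale(e, r) for e in embeddings]
    return [dot(context, e) for e in embeddings]


context = [1.0, 1.0]
# Word 0: large norm but poorly aligned; word 1: small norm, well aligned.
embeddings = [[4.0, 0.0], [1.0, 1.0]]

plain = logits(context, embeddings)         # raw inner products
fixed = logits(context, embeddings, r=2.0)  # both norms fixed to 2
```

With raw inner products the large-norm word 0 wins; with fixed norms the better-aligned word 1 wins, because each score reduces to r² times the cosine of the angle.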
Heterogeneous Congestion Control: Efficiency, Fairness and Design
When heterogeneous congestion control protocols that react to different pricing signals (e.g. packet loss, queueing delay, ECN marking, etc.) share the same network, the current theory based on utility maximization fails to predict the network behavior. Unlike in a homogeneous network, the bandwidth allocation now depends on router parameters and flow arrival patterns. It can be non-unique, inefficient and unfair. This paper has two objectives. First, we demonstrate the intricate behaviors of a heterogeneous network through simulations and present a rigorous framework to help understand its equilibrium efficiency and fairness properties. By identifying an optimization problem associated with every equilibrium, we show that every equilibrium is Pareto efficient and provide an upper bound on efficiency loss due to pricing heterogeneity. On fairness, we show that intra-protocol fairness is still decided by a utility maximization problem while inter-protocol fairness is the part over which we don't have control. However, it is shown that we can achieve any desirable inter-protocol fairness by properly choosing protocol parameters. Second, we propose a simple slow timescale source-based algorithm to decouple bandwidth allocation from router parameters and flow arrival patterns and prove its feasibility. The scheme needs only local information
- …