    Auto-Sizing Neural Networks: With Applications to n-gram Language Models

    Neural networks have been shown to improve performance across a range of natural-language tasks. However, designing and training them can be complicated. Frequently, researchers resort to repeated experimentation to pick optimal settings. In this paper, we address the issue of choosing the correct number of units in hidden layers. We introduce a method for automatically adjusting network size by pruning out hidden units through $\ell_{\infty,1}$ and $\ell_{2,1}$ regularization. We apply this method to language modeling and demonstrate its ability to correctly choose the number of hidden units while maintaining perplexity. We also include these models in a machine translation decoder and show that these smaller neural models maintain the significant improvements of their unpruned versions. Comment: EMNLP 201
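
To make the two penalties concrete, here is a minimal sketch of how $\ell_{2,1}$ and $\ell_{\infty,1}$ norms can be computed for a weight matrix, and of how a group soft-thresholding step can zero out whole rows. It rests on assumptions not stated in the abstract: each row of `W` is taken to hold the weights of one hidden unit, the proximal step is shown only for $\ell_{2,1}$, and the function names and toy numbers are hypothetical. It is an illustration of group-sparse regularization in general, not necessarily the optimization procedure used in the paper.

```python
import math

def l21_norm(W):
    """Sum over rows of each row's Euclidean norm."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)

def linf1_norm(W):
    """Sum over rows of each row's largest absolute entry."""
    return sum(max(abs(w) for w in row) for row in W)

def prox_l21(W, threshold):
    """Group soft-thresholding: shrink every row's norm by `threshold`.

    Rows whose norm is at most `threshold` become all zeros, which
    corresponds to pruning that (assumed) hidden unit.
    """
    out = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        out.append([scale * w for w in row])
    return out

def active_units(W):
    """Indices of rows that still contain a nonzero weight."""
    return [i for i, row in enumerate(W) if any(w != 0.0 for w in row)]

# Hypothetical toy matrix: three hidden units, two weights each.
W = [[3.0, 4.0], [0.1, -0.2], [0.0, 1.0]]
pruned = prox_l21(W, threshold=0.5)
print(active_units(pruned))  # the small-norm middle row is removed
```

Because the penalty acts on whole rows rather than individual weights, a unit is either kept (with shrunken weights) or dropped entirely, which is what allows the layer size to be read off after training.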

    Improving Lexical Choice in Neural Machine Translation

    We explore two solutions to the problem of mistranslating rare words in neural machine translation. First, we argue that the standard output layer, which computes the inner product of a vector representing the context with all possible output word embeddings, rewards frequent words disproportionately, and we propose to fix the norms of both vectors to a constant value. Second, we integrate a simple lexical module which is jointly trained with the rest of the model. We evaluate our approaches on eight language pairs with data sizes ranging from 100k to 8M words, and achieve improvements of up to +4.3 BLEU, surpassing phrase-based translation in nearly all settings. Comment: Accepted at NAACL HLT 201
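
The first proposal, fixing the norms of the context vector and the output embeddings, can be illustrated with a small sketch. Everything specific here is an assumption for illustration: the helper names, the constant `r`, and the toy vectors are hypothetical, and the input vectors are assumed to be nonzero. The sketch only shows why an inner product favors large-norm embeddings and how rescaling both sides to a common norm removes that effect; it is not the paper's model.

```python
import math

def rescale(v, r):
    """Rescale a nonzero vector so that its Euclidean norm equals r."""
    norm = math.sqrt(sum(x * x for x in v))
    return [r * x / norm for x in v]

def logits(context, embeddings, r=None):
    """Inner product of the context with each output embedding.

    If `r` is given, the context and every embedding are first
    rescaled to norm r, so scores depend only on direction.
    """
    if r is not None:
        context = rescale(context, r)
        embeddings = [rescale(e, r) for e in embeddings]
    return [sum(c * e for c, e in zip(context, emb)) for emb in embeddings]

context = [1.0, 1.0]
embeddings = [
    [10.0, 0.0],  # hypothetical frequent word: large norm, poorly aligned
    [0.6, 0.8],   # hypothetical rare word: small norm, well aligned
]

raw = logits(context, embeddings)           # large-norm word scores higher
fixed = logits(context, embeddings, r=2.0)  # better-aligned word scores higher
print(raw, fixed)
```

With the norms fixed, the score is `r * r` times the cosine of the angle between the two vectors, so an embedding can no longer win simply by growing in magnitude.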

    Heterogeneous Congestion Control: Efficiency, Fairness and Design

    When heterogeneous congestion control protocols that react to different pricing signals (e.g., packet loss, queueing delay, ECN marking) share the same network, the current theory based on utility maximization fails to predict the network behavior. Unlike in a homogeneous network, the bandwidth allocation now depends on router parameters and flow arrival patterns. It can be non-unique, inefficient, and unfair. This paper has two objectives. First, we demonstrate the intricate behaviors of a heterogeneous network through simulations and present a rigorous framework to help understand its equilibrium efficiency and fairness properties. By identifying an optimization problem associated with every equilibrium, we show that every equilibrium is Pareto efficient and provide an upper bound on efficiency loss due to pricing heterogeneity. On fairness, we show that intra-protocol fairness is still decided by a utility maximization problem, while inter-protocol fairness is the part over which we don't have control. However, we show that any desired inter-protocol fairness can be achieved by properly choosing protocol parameters. Second, we propose a simple slow-timescale, source-based algorithm to decouple bandwidth allocation from router parameters and flow arrival patterns, and prove its feasibility. The scheme needs only local information