
    Improving the minimum description length inference of phrase-based translation models

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-19390-8_25. We study the application of minimum description length (MDL) inference to estimate pattern recognition models for machine translation. MDL is a theoretically sound approach, but its empirical results fall below those of the state-of-the-art pipeline of training heuristics. We identify potential limitations of current MDL procedures and provide a practical approach to overcome them. Empirical results support the soundness of the proposed approach. Work supported by the EU 7th Framework Programme (FP7/2007–2013) under the CasMaCat project (grant agreement no 287576), by Spanish MICINN under grant TIN2012-31723, and by the Generalitat Valenciana under grant ALMPR (Prometeo/2009/014). Gonzalez Rubio, J.; Casacuberta Nolla, F. (2015). Improving the minimum description length inference of phrase-based translation models. In Pattern Recognition and Image Analysis: 7th Iberian Conference, IbPRIA 2015, Santiago de Compostela, Spain, June 17-19, 2015, Proceedings. Springer International Publishing. 219-227. https://doi.org/10.1007/978-3-319-19390-8_25

    Inducing Probabilistic Grammars by Bayesian Model Merging

    We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are merged to achieve generalization and a more compact representation. The choice of what to merge and when to stop is governed by the Bayesian posterior probability of the grammar given the data, which formalizes a trade-off between a close fit to the data and a default preference for simpler models ('Occam's Razor'). The general scheme is illustrated using three types of probabilistic grammars: Hidden Markov models, class-based n-grams, and stochastic context-free grammars. Comment: To appear in Grammatical Inference and Applications, Second International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13 pages
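The trade-off described in this abstract can be written out as a short worked equation. This is a sketch in generic Bayesian notation, not the paper's own formulation: the symbols $M$ (a candidate grammar) and $X$ (the sample corpus) are assumptions introduced here for illustration.

```latex
\[
P(M \mid X) \;=\; \frac{P(M)\,P(X \mid M)}{P(X)}
\quad\Longrightarrow\quad
\log P(M \mid X) \;=\; \underbrace{\log P(M)}_{\text{preference for simpler models}}
\;+\; \underbrace{\log P(X \mid M)}_{\text{fit to the data}}
\;-\; \log P(X)
\]
```

Because $P(X)$ does not depend on $M$, comparing two candidate grammars only requires the first two terms. Under this reading, a merge is worth making when the gain in the prior term outweighs any loss in the likelihood term.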

    Factored Translation Models

    Is there a physically universal cellular automaton or Hamiltonian?

    It is known that both quantum and classical cellular automata (CA) exist that are computationally universal in the sense that they can simulate, after appropriate initialization, any quantum or classical computation, respectively. Here we introduce a different notion of universality: a CA is called physically universal if every transformation on any finite region can be (approximately) implemented by the autonomous time evolution of the system after the complement of the region has been initialized in an appropriate way. We pose the question of whether physically universal CAs exist. Such CAs would provide a model of the world where the boundary between a physical system and its controller can be consistently shifted, in analogy to the Heisenberg cut for the quantum measurement problem. We propose to study the thermodynamic cost of computation and control within such a model because implementing a cyclic process on a microsystem may require a non-cyclic process for its controller, whereas implementing a cyclic process on system and controller may require the implementation of a non-cyclic process on a "meta"-controller, and so on. Physically universal CAs avoid this infinite hierarchy of controllers, and the cost of implementing cycles on a subsystem can be described by mixing properties of the CA dynamics. We define a physical prior on the CA configurations by applying the dynamics to an initial state where half of the CA is in the maximum entropy state and half of it is in the all-zero state (thus reflecting the fact that life requires non-equilibrium states like the boundary between a hot and a cold reservoir). As opposed to Solomonoff's prior, our prior accounts not only for the Kolmogorov complexity but also for the cost of isolating the system during the state preparation if the preparation process is not robust. Comment: 27 pages, 1 figure

    COMIC: Towards A Compact Image Captioning Model with Attention

    Recent work in image captioning has shown very promising raw performance. However, we observe that most of these encoder-decoder style networks with attention do not scale naturally to large vocabulary sizes, which makes them difficult to deploy on embedded systems with limited hardware resources. This is because the sizes of the word and output embedding matrices grow proportionally with the size of the vocabulary, adversely affecting the compactness of these networks. To address this limitation, this paper tackles the compactness of image captioning models, a problem that is hitherto unexplored. We show that our proposed model, named COMIC for COMpact Image Captioning, achieves comparable results in five common evaluation metrics with state-of-the-art approaches on both MS-COCO and InstaPIC-1.1M datasets despite having an embedding vocabulary size that is 39x - 99x smaller. The source code and models are available at: https://github.com/jiahuei/COMIC-Compact-Image-Captioning-with-Attention Comment: Added source code link and new results in Table
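The claim that the embedding matrices grow proportionally with vocabulary size can be checked with a small parameter count. This is a minimal sketch under stated assumptions, not the COMIC model itself: the function name `embedding_param_count` and all the sizes used below are hypothetical and are not taken from the paper.

```python
def embedding_param_count(vocab_size: int, embed_dim: int, hidden_dim: int) -> int:
    """Count weights in an input embedding table plus an output projection.

    Assumes a plain decoder layout with no weight sharing and no biases
    (an illustrative assumption, not the architecture from the paper).
    """
    input_embedding = vocab_size * embed_dim      # one row per vocabulary item
    output_projection = hidden_dim * vocab_size   # one column per vocabulary item
    return input_embedding + output_projection


# Hypothetical sizes, chosen only for illustration.
large = embedding_param_count(vocab_size=10_000, embed_dim=512, hidden_dim=512)
small = embedding_param_count(vocab_size=250, embed_dim=512, hidden_dim=512)

print(large, small, large / small)
```

With the dimensions held fixed, both terms are linear in `vocab_size`, so shrinking the vocabulary by some factor shrinks these two matrices by the same factor.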