Improving the minimum description length inference of phrase-based translation models
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-19390-8_25

We study the application of minimum description length (MDL) inference to estimate pattern recognition models for machine translation. MDL is a theoretically sound approach whose empirical results nevertheless fall below those of the state-of-the-art pipeline of training heuristics. We identify potential limitations of current MDL procedures and provide a practical approach to overcome them. Empirical results support the soundness of the proposed approach.

Work supported by the EU 7th Framework Programme (FP7/2007–2013) under the CasMaCat project (grant agreement no 287576), by Spanish MICINN under grant TIN2012-31723, and by the Generalitat Valenciana under grant ALMPR (Prometeo/2009/014).

Gonzalez Rubio, J.; Casacuberta Nolla, F. (2015). Improving the minimum description length inference of phrase-based translation models. In Pattern Recognition and Image Analysis: 7th Iberian Conference, IbPRIA 2015, Santiago de Compostela, Spain, June 17-19, 2015, Proceedings. Springer International Publishing. 219-227. https://doi.org/10.1007/978-3-319-19390-8_25
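As a concrete illustration of the two-part trade-off that MDL inference formalizes, here is a minimal Python sketch that scores a phrase table by model cost plus data cost. The per-pair bit cost and the translate_logprob interface are assumptions made for illustration, not the authors' actual estimator.

    import math

    def description_length(phrase_table, corpus, translate_logprob):
        """Two-part MDL score: bits to encode the model plus bits to
        encode the data given the model (both encodings are hypothetical)."""
        # Model cost: assume a fixed per-phrase-pair encoding cost in bits.
        BITS_PER_PHRASE_PAIR = 32  # illustrative assumption
        model_bits = len(phrase_table) * BITS_PER_PHRASE_PAIR
        # Data cost: negative log-likelihood of the corpus, converted
        # from nats to bits by dividing by ln(2).
        data_bits = -sum(translate_logprob(src, tgt, phrase_table)
                         for src, tgt in corpus) / math.log(2)
        return model_bits + data_bits

An MDL learner would prefer whichever candidate phrase table minimizes this sum, trading model compactness against fit to the bilingual corpus.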
Inducing Probabilistic Grammars by Bayesian Model Merging
We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are merged to achieve generalization and a more compact representation. The choice of what to merge and when to stop is governed by the Bayesian posterior probability of the grammar given the data, which formalizes a trade-off between a close fit to the data and a default preference for simpler models ('Occam's Razor'). The general scheme is illustrated using three types of probabilistic grammars: hidden Markov models, class-based n-grams, and stochastic context-free grammars.

Comment: To appear in Grammatical Inference and Applications, Second International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13 pages.
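The incorporate-then-merge search can be sketched as a greedy loop over candidate merges, as below. Here grammar.merge, log_prior, and log_likelihood are assumed interfaces standing in for the paper's actual grammar representation and Bayesian posterior computation; this is a sketch of the scheme, not the authors' implementation.

    import itertools

    def model_merging(grammar, data, log_prior, log_likelihood):
        """Greedy Bayesian model merging (minimal sketch).
        log_prior rewards compact grammars (Occam's Razor);
        log_likelihood measures fit to the observed samples."""
        best_score = log_prior(grammar) + log_likelihood(grammar, data)
        improved = True
        while improved:
            improved = False
            for a, b in itertools.combinations(grammar.states, 2):
                candidate = grammar.merge(a, b)  # hypothetical merge op
                score = log_prior(candidate) + log_likelihood(candidate, data)
                if score > best_score:
                    grammar, best_score, improved = candidate, score, True
                    break  # restart the search from the merged grammar
        return grammar

The loop stops exactly when no single merge improves the posterior, which is how the posterior governs both "what to merge" and "when to stop."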
Is there a physically universal cellular automaton or Hamiltonian?
It is known that both quantum and classical cellular automata (CA) exist that
are computationally universal in the sense that they can simulate, after
appropriate initialization, any quantum or classical computation, respectively.
Here we introduce a different notion of universality: a CA is called physically
universal if every transformation on any finite region can be (approximately)
implemented by the autonomous time evolution of the system after the complement
of the region has been initialized in an appropriate way. We pose the question
of whether physically universal CAs exist. Such CAs would provide a model of
the world where the boundary between a physical system and its controller can
be consistently shifted, in analogy to the Heisenberg cut for the quantum
measurement problem. We propose to study the thermodynamic cost of computation
and control within such a model because implementing a cyclic process on a
microsystem may require a non-cyclic process for its controller, whereas
implementing a cyclic process on system and controller may require the
implementation of a non-cyclic process on a "meta"-controller, and so on.
Physically universal CAs avoid this infinite hierarchy of controllers and the
cost of implementing cycles on a subsystem can be described by mixing
properties of the CA dynamics. We define a physical prior on the CA
configurations by applying the dynamics to an initial state where half of the
CA is in the maximum entropy state and half of it is in the all-zero state
(thus reflecting the fact that life requires non-equilibrium states like the boundary between a hot and a cold reservoir). As opposed to Solomonoff's prior, our prior accounts not only for the Kolmogorov complexity but also for the cost of isolating the system during the state preparation if the preparation process is not robust.

Comment: 27 pages, 1 figure.
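A toy version of the proposed physical prior can be sampled as follows. A classical elementary cellular automaton (rule 110, chosen arbitrarily) stands in for the reversible dynamics the paper actually considers, so this is only a schematic illustration of the half-maximum-entropy, half-all-zero initialization.

    import random

    def step(cells, rule=110):
        """One update of a 1D elementary CA with periodic boundaries."""
        n = len(cells)
        return [(rule >> (cells[(i - 1) % n] << 2
                          | cells[i] << 1
                          | cells[(i + 1) % n])) & 1
                for i in range(n)]

    def sample_physical_prior(n=64, t=100):
        """Draw one configuration from the toy 'physical prior':
        half the cells in the maximum-entropy (uniform random) state,
        half all-zero, then evolved under the dynamics for t steps."""
        cells = ([random.randint(0, 1) for _ in range(n // 2)]
                 + [0] * (n - n // 2))
        for _ in range(t):
            cells = step(cells)
        return cells

The distribution of the returned configurations is the prior: randomness spreads from the disordered half into the ordered half, mimicking the non-equilibrium boundary the abstract describes.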
COMIC: Towards A Compact Image Captioning Model with Attention
Recent works in image captioning have shown very promising raw performance. However, we observe that most of these encoder-decoder style networks with attention do not scale naturally to large vocabulary sizes, making them difficult to deploy on embedded systems with limited hardware resources. This is because the sizes of the word and output embedding matrices grow proportionally with the size of the vocabulary, adversely affecting the compactness of these networks. To address this limitation, this paper introduces a new idea in the domain of image captioning: we tackle the hitherto unexplored problem of the compactness of image captioning models. We show that our proposed model, named COMIC (COMpact Image Captioning), achieves results comparable to state-of-the-art approaches on five common evaluation metrics on both the MS-COCO and InstaPIC-1.1M datasets, despite having an embedding vocabulary size that is 39x - 99x smaller. The source code and models are available at: https://github.com/jiahuei/COMIC-Compact-Image-Captioning-with-Attention

Comment: Added source code link and new results in Table
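To see why the embedding matrices dominate the parameter budget, here is a back-of-the-envelope count in Python. The embedding dimension and the two vocabulary sizes are assumptions for illustration, not the paper's exact figures.

    def embedding_params(vocab_size, dim=512):
        """Parameters in the input word embedding plus the output
        projection, both of which scale linearly with vocabulary size."""
        return 2 * vocab_size * dim

    # Illustrative comparison (numbers assumed, not taken from the paper):
    baseline = embedding_params(10_000)  # conventional word-level vocabulary
    compact = embedding_params(256)      # a much smaller compact vocabulary
    print(f"baseline: {baseline:,} params, compact: {compact:,} params "
          f"({baseline / compact:.0f}x smaller)")

Because both matrices scale linearly with vocabulary size, shrinking the vocabulary by roughly 39x - 99x shrinks this portion of the network by the same factor.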