Probabilistic Parsing Strategies
We present new results on the relation between purely symbolic context-free parsing strategies and their probabilistic counterparts. Such parsing strategies are seen as constructions of push-down devices from grammars. We show that preservation of probability distribution is possible under two conditions, viz. the correct-prefix property and the property of strong predictiveness. These results generalize existing results in the literature that were obtained by considering parsing strategies in isolation. From our general results we also derive negative results on so-called generalized LR parsing.
Comment: 36 pages, 1 figure
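A hedged formal reading of the preservation property (the notation is ours, assumed for illustration, not taken from the paper): a strategy maps a probabilistic grammar (G, p_G) to a push-down device A together with a correspondence c between derivations of G and computations of A; the distribution is preserved iff transition probabilities w can be assigned to A such that

    \[
      \forall d \in D(G) :\quad p_{A,w}\bigl(c(d)\bigr) \;=\; p_G(d),
    \]

where D(G) denotes the set of complete derivations of G.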
Computation of moments for probabilistic finite-state automata
The computation of moments of probabilistic finite-state automata (PFA) is studied in this article. First, the computation of moments of the length of the paths is introduced for general PFA; then, the computation of moments of the number of times that a symbol appears in the strings generated by the PFA is described. These computations require a matrix inversion. Acyclic PFA, such as word graphs, are quite common in many practical applications, and algorithms for the efficient computation of the moments for acyclic PFA are also presented.
This work has been partially supported by the Ministerio de Ciencia y Tecnologia under grant TIN2017-91452-EXP (IBEM), by the Generalitat Valenciana under grant PROMETEO/2019/121 (DeepPattern), and by the grant "Ayudas Fundacion BBVA a equipos de investigacion cientifica 2018" (PR[8]_HUM_C2_0087).
Sánchez Peiró, J. A., & Romero, V. (2020). Computation of moments for probabilistic finite-state automata. Information Sciences, 516, 388-400. https://doi.org/10.1016/j.ins.2019.12.052
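As a concrete illustration of the matrix inversion mentioned in the abstract, here is a minimal sketch (ours, with illustrative numbers, not the paper's algorithm) computing the first moment of string length for a small proper PFA:

    import numpy as np

    # Toy proper PFA (illustrative numbers): P[i, j] is the total
    # probability of moving from state i to state j, one symbol emitted
    # per move; f[i] is the probability of stopping in state i.
    P = np.array([[0.2, 0.5],
                  [0.3, 0.1]])
    f = np.array([0.3, 0.6])          # each row of P sums to 1 - f[i]
    pi = np.array([1.0, 0.0])         # initial state distribution

    # The expected remaining length t[i] from state i satisfies
    # t = P @ 1 + P @ t, hence t = (I - P)^{-1} (P @ 1): one inversion.
    move = P.sum(axis=1)
    t = np.linalg.solve(np.eye(len(f)) - P, move)
    print("E[length] =", pi @ t)

For acyclic PFA, (I - P) is triangular once states are sorted topologically, so the same system can be solved in a single backward pass, which plausibly underlies the efficient algorithms the paper presents for word graphs.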
Consistency of Probabilistic Context-Free Grammars
We present an algorithm for deciding whether an arbitrary proper probabilistic context-free grammar is consistent, i.e., whether the probability that a derivation terminates is one. Our procedure has time complexity in the unit-cost model of computation. Moreover, we develop a novel characterization of consistent probabilistic context-free grammars. A simple corollary of our result is that training methods for probabilistic context-free grammars that are based on maximum-likelihood estimation always yield consistent grammars.
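For context, the classical sufficient test for consistency (Booth & Thompson, 1973) checks the spectral radius of the grammar's first-moment matrix; a minimal sketch of that test (not necessarily the procedure of this paper) follows:

    import numpy as np

    # Toy proper PCFG (illustrative): S -> S S with p = 0.4, S -> a with
    # p = 0.6. M[A, B] is the expected number of occurrences of
    # nonterminal B produced by one expansion of A; here M = [[0.8]].
    M = np.array([[0.4 * 2]])
    rho = max(abs(np.linalg.eigvals(M)))
    print("consistent" if rho < 1 else "not guaranteed consistent")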
A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars
The paper gives a brief review of the expectation-maximization algorithm (Dempster 1977) in the comprehensible framework of discrete mathematics. In Section 2, two prominent estimation methods, the relative-frequency estimation and the maximum-likelihood estimation, are presented. Section 3 is dedicated to the expectation-maximization algorithm and a simpler variant, the generalized expectation-maximization algorithm. In Section 4, two loaded dice are rolled. A more interesting example is presented in Section 5: the estimation of probabilistic context-free grammars.
Comment: Presented at the 15th European Summer School in Logic, Language and Information (ESSLLI 2003). Example 5 extended (and partially corrected).
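To make the dice example of Section 4 concrete, here is a minimal EM sketch under one common reading (ours, not necessarily the tutorial's exact setup): each observed roll comes from one of two loaded dice, and the mixing weight and face probabilities are unknown. Data and initialization are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    # Toy data: rolls drawn from a mixture of two loaded dice.
    rolls = np.concatenate([
        rng.choice(6, 300, p=[0.5, 0.1, 0.1, 0.1, 0.1, 0.1]),
        rng.choice(6, 200, p=[0.1, 0.1, 0.1, 0.1, 0.1, 0.5]),
    ])

    w = 0.5                                     # P(die 1)
    theta = rng.dirichlet(np.ones(6), size=2)   # face probabilities

    for _ in range(50):
        # E-step: responsibility of die 1 for each observed roll.
        p1 = w * theta[0, rolls]
        p2 = (1.0 - w) * theta[1, rolls]
        r = p1 / (p1 + p2)
        # M-step: re-estimate mixing weight and face probabilities.
        w = r.mean()
        for k, resp in ((0, r), (1, 1.0 - r)):
            counts = np.bincount(rolls, weights=resp, minlength=6)
            theta[k] = counts / counts.sum()

    print("estimated mixing weight:", round(float(w), 3))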
Representation and Stochastic Resolution of Ambiguity in Constraint-Based Parsing
This thesis investigates two complementary approaches to coping with ambiguity in natural language processing. It first presents methods that allow many competing interpretations to be stored compactly in one shared data structure. It then suggests approaches to scoring the different interpretations using stochastic models. This leads to the problem of estimating the probabilities of rare events that were not observed in the training data, for which novel methods are proposed.
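As an illustration of the rare-event problem (the method shown is standard Witten-Bell smoothing, not one of the novel methods of the thesis): probability mass must be reserved for events never seen in training.

    from collections import Counter

    def witten_bell(counts, vocab_size):
        """Smoothed probabilities that reserve mass for unseen events."""
        n = sum(counts.values())        # total observations
        t = len(counts)                 # distinct observed event types
        unseen = vocab_size - t         # event types never observed
        def prob(event):
            if event in counts:
                return counts[event] / (n + t)
            return t / ((n + t) * unseen) if unseen else 0.0
        return prob

    # Illustrative rule counts from a toy tree-bank:
    p = witten_bell(Counter({"NP -> DT NN": 8, "NP -> NN": 2}), 50)
    print(p("NP -> DT NN"), p("NP -> JJ NN"))   # a seen vs. an unseen rule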
Learning Efficient Disambiguation
This dissertation analyses the computational properties of current performance models of natural language parsing, in particular Data Oriented Parsing (DOP), points out some of their major shortcomings, and suggests suitable solutions. It provides proofs that various problems of probabilistic disambiguation are NP-complete under instances of these performance models, and it argues that none of these models accounts for attractive efficiency properties of human language processing in limited domains, e.g. that frequent inputs are usually processed faster than infrequent ones. The central hypothesis of this dissertation is that these shortcomings can be eliminated by specializing the performance models to the limited domains. The dissertation addresses "grammar and model specialization" and presents a new framework, the Ambiguity-Reduction Specialization (ARS) framework, which formulates the necessary and sufficient conditions for successful specialization. The framework is instantiated into specialization algorithms and applied to specializing DOP. Novelties of these learning algorithms are that 1) they limit the hypothesis space to include only "safe" models, 2) they are expressed as constrained optimization formulae that minimize the entropy of the training tree-bank given the specialized grammar, under the constraint that the size of the specialized model does not exceed a predefined maximum (formalized below), and 3) they enable integrating the specialized model with the original one in a complementary manner. The dissertation provides experiments with initial implementations and compares the resulting Specialized DOP (SDOP) models to the original DOP models, with encouraging results.
Comment: 222 pages
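A hedged formalization of the constrained optimization described above (the notation is ours, not the dissertation's): with T the training tree-bank, \mathcal{S} the space of safe specialized models, and M the size bound,

    \[
      G_s^{*} \;=\; \operatorname*{arg\,min}_{G_s \in \mathcal{S}}
        \; H\bigl(T \mid G_s\bigr)
      \qquad \text{subject to} \qquad \lvert G_s \rvert \le M .
    \]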
Visualization, Adaptation, and Transformation of Procedural Grammars
Procedural shape grammars are powerful tools for the automatic generation of highly detailed 3D content from a set of descriptive rules. It is easy to encode variations in stochastic and parametric grammars, and countless models can be generated quickly. While shape grammars offer these advantages over manual 3D modeling, they also suffer from certain drawbacks. We present three novel methods that address some of the limitations of shape grammars. First, it is often difficult to grasp the diversity of models defined by a given grammar. We propose a pipeline to automatically generate, cluster, and select a set of representative preview images for a grammar. The system is based on a new view attribute descriptor that measures how suitable an image is for representing a model and that enables the comparison of different models derived from the same grammar. Second, the default distribution of models in a stochastic grammar is often undesirable. We introduce a framework that allows users to design a new probability distribution for a grammar without editing the rules. Gaussian process regression interpolates user preferences from a set of scored models over an entire shape space, and a symbol split operation adapts the grammar to generate models according to the learned distribution. Third, it is hard to combine elements of two grammars into new designs. We present design transformations and grammar co-derivation to create new designs from existing ones. Algorithms for fine-grained rule merging can generate a large space of design variations and can be used to create animated transformation sequences between different procedural designs. Our contributions to visualizing, adapting, and transforming grammars make the procedural modeling methodology more accessible to non-programmers.
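A minimal sketch of the second contribution's scoring step under assumed details (the feature vectors, kernel, and scores are illustrative; the paper's actual shape-space parameterization is not reproduced here): fit a Gaussian process to a few user-scored models and interpolate preferences over the rest of the space.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    # Each row holds the parameters of one generated model (hypothetical
    # two-dimensional shape space); scores are user preference ratings.
    X_scored = np.array([[0.1, 0.8], [0.5, 0.5], [0.9, 0.2], [0.3, 0.9]])
    scores = np.array([0.2, 0.9, 0.4, 0.6])

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5))
    gp.fit(X_scored, scores)

    # Interpolated preference for unseen points of the shape space; a
    # learned distribution like this could then drive sampling.
    X_new = np.array([[0.4, 0.6], [0.8, 0.8]])
    pred, std = gp.predict(X_new, return_std=True)
    print(pred, std)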