435 research outputs found

Estimating Compact Yet Rich Tree Insertion Grammars

Author: Shieber Stuart M.
Yamangil Elif
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 20/11/2013
Field of study

We present a Bayesian nonparametric model for estimating tree insertion grammars (TIG), building upon recent work in Bayesian inference of tree substitution grammars (TSG) via Dirichlet processes. Under our general variant of TIG, grammars are estimated via the Metropolis-Hastings algorithm that uses a context free grammar transformation as a proposal, which allows for cubic-time string parsing as well as tree-wide joint sampling of derivations in the spirit of Cohn and Blunsom (2010). We use the Penn treebank for our experiments and find that our proposed Bayesian TIG model not only has competitive parsing performance but also finds compact yet linguistically rich TIG representations of the data.

Engineering and Applied Science

CiteSeerX

Harvard University - DASH

Nonparametric Bayesian Inference and Efficient Parsing for Tree-adjoining Grammars

Author: Shieber Stuart M.
Yamangil Elif
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 20/11/2013
Field of study

In the line of research extending statistical parsing to more expressive grammar formalisms, we demonstrate for the first time the use of tree-adjoining grammars (TAG). We present a Bayesian nonparametric model for estimating a probabilistic TAG from a parsed corpus, along with novel block sampling methods and approximation transformations for TAG that allow efficient parsing. Our work shows performance improvements on the Penn Treebank and finds more compact yet linguistically rich representations of the data, but more importantly provides techniques in grammar transformation and statistical inference that make practical the use of these more expressive systems, thereby enabling further experimentation along these lines.

Engineering and Applied Science

CiteSeerX

Harvard University - DASH

A Context Free TAG Variant

Author: Charniak Eugene
Shieber Stuart M.
Swanson Ben
Yamangil Elif
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 26/07/2013
Field of study

Engineering and Applied Science

Harvard University - DASH

Rich Linguistic Structure from Large-Scale Web Data

Author: Yamangil Elif
Publication venue: 'Harvard University Botany Libraries'
Publication date: 18/10/2013
Field of study

The past two decades have shown an unexpected effectiveness of Web-scale data in natural language processing. Even the simplest models, when paired with unprecedented amounts of unstructured and unlabeled Web data, have been shown to outperform sophisticated ones. It has been argued that the effectiveness of Web-scale data has undermined the necessity of sophisticated modeling or laborious data set curation. In this thesis, we argue for and illustrate an alternative view, that Web-scale data not only serves to improve the performance of simple models, but also can allow the use of qualitatively more sophisticated models that would not be deployable otherwise, leading to even further performance gains.

Engineering and Applied Science

CiteSeerX

Harvard University - DASH

Logical Hidden Markov Models

Author: De Raedt L.
Kersting K.
Raiko T.
Publication venue: 'AI Access Foundation'
Publication date: 31/12/2010
Field of study

Logical hidden Markov models (LOHMMs) upgrade traditional hidden Markov models to deal with sequences of structured symbols in the form of logical atoms, rather than flat characters. This note formally introduces LOHMMs and presents solutions to the three central inference problems for LOHMMs: evaluation, most likely hidden state sequence, and parameter estimation. The resulting representation and algorithms are experimentally evaluated on problems from the domain of bioinformatics.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Representing Conversations for Scalable Overhearing

Author: Gutnik G.
Kaminka G. A.
Publication venue: 'AI Access Foundation'
Publication date: 26/09/2011
Field of study

Open distributed multi-agent systems are gaining interest in the academic community and in industry. In such open settings, agents are often coordinated using standardized agent conversation protocols. The representation of such protocols (for analysis, validation, monitoring, etc.) is an important aspect of multi-agent applications. Recently, Petri nets have been shown to be an interesting approach to such representation, and radically different approaches using Petri nets have been proposed. However, their relative strengths and weaknesses have not been examined. Moreover, their scalability and suitability for different tasks have not been addressed. This paper addresses both of these challenges. First, we analyze existing Petri net representations in terms of their scalability and appropriateness for overhearing, an important task in monitoring open multi-agent systems. Then, building on the insights gained, we introduce a novel representation using Colored Petri nets that explicitly represent legal joint conversation states and messages. This representation approach offers significant improvements in scalability and is particularly suitable for overhearing. Furthermore, we show that this new representation offers comprehensive coverage of all conversation features of FIPA conversation standards. We also present a procedure for transforming AUML conversation protocol diagrams (a standard human-readable representation) to our Colored Petri net representation.

arXiv.org e-Print Archive

Crossref

Acta Cybernetica : Volume 18. Number 4.

Author
Publication venue
Publication date: 01/01/2008
Field of study

University of Szeged

Application of stochastic grammars to understanding action

Author: Ivanov Yuri A., 1967-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1998
Field of study

Thesis (M.S.), Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1998. Includes bibliographical references (leaves 69-72). By Yuri A. Ivanov.

CiteSeerX

DSpace@MIT

Efficient Generator of Mathematical Expressions for Symbolic Regression

Author: Džeroski Sašo
Mežnar Sebastian
Todorovski Ljupčo
Publication venue
Publication date: 10/09/2023
Field of study

We propose an approach to symbolic regression based on a novel variational autoencoder for generating hierarchical structures, HVAE. It combines simple atomic units with shared weights to recursively encode and decode the individual nodes in the hierarchy. Encoding is performed bottom-up and decoding top-down. We empirically show that HVAE can be trained efficiently with small corpora of mathematical expressions and can accurately encode expressions into a smooth low-dimensional latent space. The latter can be efficiently explored with various optimization methods to address the task of symbolic regression. Indeed, random search through the latent space of HVAE performs better than random search through expressions generated by manually crafted probabilistic grammars for mathematical expressions. Finally, the EDHiE system for symbolic regression, which applies an evolutionary algorithm to the latent space of HVAE, reconstructs equations from a standard symbolic regression benchmark better than a state-of-the-art system based on a similar combination of deep learning and evolutionary algorithms. Comment: 35 pages, 11 tables, 7 multi-part figures, Machine learning (Springer) and journal track of ECML/PKDD 202

arXiv.org e-Print Archive

Repository of the University of Ljubljana

On Language Processors and Software Maintenance

Author: Lohmann Wolfgang (gnd: 138536171)
Publication venue: Universität Rostock
Publication date: 01/01/2009
Field of study

This work investigates declarative transformation tools in the context of software maintenance. Besides maintenance of the language specification, evolution of a software language requires the adaptation of the software written in that language as well as the adaptation of the software that transforms software written in the evolving language. This co-evolution is studied to derive automatic adaptations of artefacts from adaptations of the language specification. Furthermore, AOP for Prolog is introduced to improve maintainability of language specifications and derived tools.

Rostocker Dokumentenserver