Structured Prediction of Sequences and Trees using Infinite Contexts
Linguistic structures exhibit a rich array of global phenomena; however,
commonly used Markov models are unable to adequately describe these phenomena
due to their strong locality assumptions. We propose a novel hierarchical model
for structured prediction over sequences and trees which exploits global
context by conditioning each generation decision on an unbounded context of
prior decisions. This builds on the success of Markov models while removing
their fixed context bound, so as to better represent global phenomena. To
facilitate learning of this large and unbounded model, we use a hierarchical
Pitman-Yor process prior which provides a recursive form of smoothing. We
propose prediction algorithms based on A* and Markov Chain Monte Carlo
sampling. Empirical results demonstrate the potential of our model compared to
baseline finite-context Markov models on part-of-speech tagging and syntactic
parsing.
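The recursive smoothing described above lends itself to a compact recursion over context suffixes. Below is a minimal Python sketch of the hierarchical Pitman-Yor predictive rule over unbounded contexts, using the common one-table-per-type approximation rather than full seating arrangements; the discount, strength, and vocabulary-size defaults are illustrative assumptions, not the paper's settings.

```python
from collections import defaultdict

class HPYLM:
    """Minimal sketch of hierarchical Pitman-Yor smoothing over unbounded
    contexts. Uses the one-table-per-type approximation instead of full
    seating arrangements; all hyperparameter values are assumptions."""

    def __init__(self, discount=0.8, strength=1.0, vocab_size=10000):
        self.d, self.theta, self.V = discount, strength, vocab_size
        self.counts = defaultdict(lambda: defaultdict(int))  # context -> word -> count

    def observe(self, sequence):
        # Record each symbol under every suffix of its (unbounded) history.
        for t, word in enumerate(sequence):
            history = tuple(sequence[:t])
            for i in range(len(history) + 1):
                self.counts[history[i:]][word] += 1

    def prob(self, context, word):
        context = tuple(context)
        # Back off recursively from the full context to its longest proper
        # suffix; the empty context backs off to a uniform base distribution.
        backoff = self.prob(context[1:], word) if context else 1.0 / self.V
        cw = self.counts.get(context)
        if not cw:
            return backoff
        c_uw = cw.get(word, 0)
        c_u = sum(cw.values())
        t_uw = 1 if c_uw > 0 else 0   # tables serving `word` (approximation)
        t_u = len(cw)                 # total tables in this restaurant (approximation)
        return (max(c_uw - self.d * t_uw, 0.0)
                + (self.theta + self.d * t_u) * backoff) / (c_u + self.theta)
```

In practice the context would be truncated or stored in a suffix tree for efficiency; the quadratic `observe` step here is purely for clarity.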
Comparing Probabilistic Models for Melodic Sequences
Modeling the real-world complexity of music is a challenge for machine
learning. We address the task of modeling melodic sequences from the same music
genre. We perform a comparative analysis of two probabilistic models: a
Dirichlet Variable Length Markov Model (Dirichlet-VMM) and a Time Convolutional
Restricted Boltzmann Machine (TC-RBM). We show that the TC-RBM learns
descriptive music features, such as underlying chords and typical melody
transitions and dynamics. We assess the models for future prediction and
compare their performance to a VMM, which is the current state of the art in
melody generation. We show that both models perform significantly better than
the VMM, with the Dirichlet-VMM marginally outperforming the TC-RBM. Finally,
we evaluate the short-order statistics of the models using the
Kullback-Leibler divergence between test sequences and model samples, and show
that our proposed methods match the statistics of the music genre significantly
better than the VMM.
Comment: In Proceedings of ECML-PKDD 2011. Lecture Notes in Computer Science, vol. 6913, pp. 289-304. Springer (2011).
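The short-order evaluation described above compares n-gram statistics of test sequences and model samples via Kullback-Leibler divergence. A minimal sketch of that comparison follows; the epsilon floor for n-grams unseen in the samples is an assumption.

```python
import math
from collections import Counter

def ngram_dist(sequences, n):
    """Empirical distribution over n-grams in a corpus of sequences."""
    counts = Counter(tuple(seq[i:i + n]) for seq in sequences
                     for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), flooring q at eps for n-grams unseen in the samples."""
    return sum(pv * math.log(pv / q.get(g, eps)) for g, pv in p.items())

# e.g. compare bigram statistics of held-out melodies with model samples:
# kl = kl_divergence(ngram_dist(test_melodies, 2), ngram_dist(model_samples, 2))
```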
Language Modeling with Power Low Rank Ensembles
We present power low rank ensembles (PLRE), a flexible framework for n-gram
language modeling where ensembles of low rank matrices and tensors are used to
obtain smoothed probability estimates of words in context. Our method can be
understood as a generalization of n-gram modeling to non-integer n, and
includes standard techniques such as absolute discounting and Kneser-Ney
smoothing as special cases. PLRE training is efficient and our approach
outperforms state-of-the-art modified Kneser-Ney baselines in perplexity on
large corpora, as well as in BLEU score on a downstream machine translation
task.
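As a rough illustration of the idea PLRE generalizes, the sketch below smooths a bigram count matrix with a low-rank approximation and interpolates it with the unigram distribution. This is not the paper's PLRE algorithm; the SVD-based construction, rank, and interpolation weight are assumptions for illustration only.

```python
import numpy as np

def low_rank_bigram(counts, rank=50, interp=0.3):
    """Toy stand-in for low-rank n-gram smoothing (NOT the PLRE algorithm).
    counts[i, j] = number of times word j follows word i."""
    U, S, Vt = np.linalg.svd(counts.astype(float), full_matrices=False)
    approx = (U[:, :rank] * S[:rank]) @ Vt[:rank]      # rank-`rank` reconstruction
    approx = np.clip(approx, 0.0, None)                # keep pseudo-counts nonnegative
    bigram = approx / (approx.sum(axis=1, keepdims=True) + 1e-12)  # P(w | w_prev)
    unigram = counts.sum(axis=0) / counts.sum()        # lower-order estimate
    return (1.0 - interp) * bigram + interp * unigram  # two-member "ensemble"
```

The interpolation with a lower-order estimate is the same role absolute discounting and Kneser-Ney back-off play in the special cases the abstract mentions.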
Reified Context Models
A classic tension exists between exact inference in a simple model and
approximate inference in a complex model. The latter offers expressivity and
thus accuracy, but the former provides coverage of the space, an important
property for confidence estimation and learning with indirect supervision. In
this work, we introduce a new approach, reified context models, to reconcile
this tension. Specifically, we let the amount of context (the arity of the
factors in a graphical model) be chosen "at run-time" by reifying it, that is,
letting this choice itself be a random variable inside the model. Empirically,
we show that our approach obtains expressivity and coverage on three natural
language tasks.
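A toy rendering of the reification idea: treat the amount of context K as a latent variable and marginalize it out when scoring a prediction. The function names and mixture factorization below are assumptions for illustration, not the paper's construction.

```python
def reified_context_prob(y, history, cond_models, context_prior):
    """Score y by mixing over context sizes K, where K is itself random.
    cond_models[k](y, ctx) is assumed to return p(y | last k symbols);
    context_prior[k] is p(K = k)."""
    total = 0.0
    for k, model in enumerate(cond_models):
        ctx = tuple(history[-k:]) if k > 0 else ()  # guard: history[-0:] is everything
        total += context_prior[k] * model(y, ctx)
    return total

# e.g. mixing a unigram and a bigram conditional (both hypothetical tables):
# models = [lambda y, ctx: unigram[y],
#           lambda y, ctx: bigram[ctx[-1]][y]]
# p = reified_context_prob(w, prefix, models, context_prior=[0.3, 0.7])
```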
Neural probabilistic language model for system combination
This paper gives the system description of the neural probabilistic language modeling (NPLM) team of Dublin City University for our participation in the system combination task in the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12). We used the information obtained by NPLM as meta information to the system combination module. For the Spanish-English data, our paraphrasing approach achieved 25.81 BLEU points, which lost 0.19 BLEU points absolute compared to the standard confusion network-based system combination. We note that our current usage of NPLM is very limited due to the difficulty in combining NPLM and system combination