48 research outputs found

Asymptotic Analysis of Generative Semi-Supervised Learning

Authors: Balasubramanian, Krishnakumar; Dillon, Joshua V; Lebanon, Guy
Publication date: 01/01/2010

Semi-supervised learning has emerged as a popular framework for improving modeling accuracy while controlling labeling cost. Based on an extension of stochastic composite likelihood, we quantify the asymptotic accuracy of generative semi-supervised learning. In doing so, we complement distribution-free analysis by providing an alternative framework to measure the value associated with different labeling policies and resolve the fundamental question of how much data to label and in what manner. We demonstrate our approach with both simulation studies and real-world experiments using naive Bayes for text classification and MRFs and CRFs for structured prediction in NLP. Comment: 12 pages, 9 figures

arXiv.org e-Print Archive

CiteSeerX

Statistical and Computational Tradeoffs in Stochastic Composite Likelihood

Authors: Dillon, Joshua V; Lebanon, Guy
Publication date: 01/01/2009

Maximum likelihood estimators are often of limited practical use due to the intensive computation they require. We propose a family of alternative estimators that maximize a stochastic variation of the composite likelihood function. Each of the estimators resolves the computation-accuracy tradeoff differently, and taken together they span a continuous spectrum of computation-accuracy tradeoff resolutions. We prove the consistency of the estimators and provide formulas for their asymptotic variance, statistical robustness, and computational complexity. We discuss experimental results in the context of Boltzmann machines and conditional random fields. The theoretical and experimental studies demonstrate the effectiveness of the estimators when the computational resources are insufficient. They also demonstrate that in some cases reduced computational complexity is associated with robustness, thereby increasing statistical accuracy. Comment: 30 pages, 97 figures, 2 authors

arXiv.org e-Print Archive

CiteSeerX

Distributed Parameter Estimation via Pseudo-likelihood

Authors: Ihler, Alexander; Liu, Qiang
Publication date: 01/01/2012

Estimating statistical models within sensor networks requires distributed algorithms, in which both data and computation are distributed across the nodes of the network. We propose a general approach for distributed learning based on combining local estimators defined by pseudo-likelihood components, encompassing a number of combination methods, and provide both theoretical and experimental analysis. We show that simple linear combination or max-voting methods, when combined with second-order information, are statistically competitive with more advanced and costly joint optimization. Our algorithms have many attractive properties, including low communication and computational cost and "any-time" behavior. Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

arXiv.org e-Print Archive

CiteSeerX

Distributed Parameter Estimation in Probabilistic Graphical Models

Authors: de Freitas, Nando; Denil, Misha; Mizrahi, Yariv Dror
Publication date: 01/01/2014

This paper presents foundational theoretical results on distributed parameter estimation for undirected probabilistic graphical models. It introduces a general condition on composite likelihood decompositions of these models that guarantees the global consistency of distributed estimators, provided the local estimators are consistent.

arXiv.org e-Print Archive

Oxford University Research Archive