Learning mixtures of product distributions over discrete domains
We consider the problem of learning mixtures of product distributions over discrete domains in the distribution learning framework introduced by Kearns et al. [18]. We give a poly(n/ε)-time algorithm for learning a mixture of k arbitrary product distributions over the n-dimensional Boolean cube {0,1}^n to accuracy ε, for any constant k. Previous polynomial-time algorithms could only achieve this for k = 2 product distributions; our result answers an open question stated independently in [8] and [14]. We further give evidence that no polynomial-time algorithm can succeed when k is superconstant, by reduction from a notorious open problem in PAC learning. Finally, we generalize our poly(n/ε)-time algorithm to learn any mixture of k = O(1) product distributions over {0, 1, ..., b}^n, for any b = O(1).
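To make the model class in this abstract concrete, here is a minimal Python sketch of a mixture of k product distributions over {0,1}^n, supporting sampling and point-wise log-likelihood. The function names and parameters are hypothetical illustrations; this is not the paper's learning algorithm.

```python
import numpy as np

def sample_mixture(weights, bias, m, rng=None):
    """Draw m samples from a mixture of product distributions over {0,1}^n.

    weights: shape (k,)   -- mixing weights, summing to 1
    bias:    shape (k, n) -- bias[j, i] = Pr[x_i = 1 | component j]
    """
    rng = rng or np.random.default_rng()
    k, n = bias.shape
    comps = rng.choice(k, size=m, p=weights)  # latent component per sample
    return (rng.random((m, n)) < bias[comps]).astype(int)

def log_likelihood(x, weights, bias):
    """Log-probability of a single point x in {0,1}^n under the mixture."""
    # Per-component log Pr[x | j] = sum_i [ x_i log(bias) + (1-x_i) log(1-bias) ]
    logp = (x * np.log(bias) + (1 - x) * np.log(1 - bias)).sum(axis=1)
    return np.logaddexp.reduce(np.log(weights) + logp)
```

Each component is fully described by n biases, so a k-component mixture has only k(n+1) parameters; the difficulty the abstract refers to is statistical and computational, not representational.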
Learning mixtures of structured distributions over discrete domains
Let C be a class of probability distributions over the discrete domain [n] = {1, ..., n}. We show that if C satisfies a rather general condition -- essentially, that each distribution in C can be well-approximated by a variable-width histogram with few bins -- then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of k unknown distributions from C. We analyze several natural types of distributions over [n], including log-concave, monotone hazard rate and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all these mixture learning problems.
Comment: preliminary full version of SODA'13 paper
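The structural condition above can be illustrated with a short sketch: "flatten" a distribution p over [n] into a t-bin variable-width histogram and measure the L1 approximation error. The equal-mass partition heuristic and all names below are assumptions made for illustration, not the paper's algorithm.

```python
import numpy as np

def flatten(p, cuts):
    """Replace p by its average value on each interval [cuts[j], cuts[j+1])."""
    q = np.empty_like(p, dtype=float)
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        q[lo:hi] = p[lo:hi].mean()
    return q

def equal_mass_cuts(p, t):
    """Heuristic: cut [n] into at most t intervals of roughly equal mass."""
    cdf = np.cumsum(p)
    inner = {int(np.searchsorted(cdf, j / t)) + 1 for j in range(1, t)}
    return sorted({0, len(p)} | {c for c in inner if 0 < c < len(p)})

# Example: a monotone (geometric-like) distribution over [n] is already
# well-approximated by a histogram with few bins.
n, t = 1000, 10
p = 0.99 ** np.arange(n)
p /= p.sum()
q = flatten(p, equal_mass_cuts(p, t))
print("L1 error with", t, "bins:", np.abs(p - q).sum())
```

Running this with increasing t shows the L1 error dropping quickly for the monotone example, matching the intuition that the structured classes named in the abstract admit few-bin histogram approximations.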
Bayesian Learning of Sum-Product Networks
Sum-product networks (SPNs) are flexible density estimators and have received
significant attention due to their attractive inference properties. While
parameter learning in SPNs is well developed, structure learning leaves
something to be desired: Even though there is a plethora of SPN structure
learners, most of them are somewhat ad-hoc and based on intuition rather than a
clear learning principle. In this paper, we introduce a well-principled
Bayesian framework for SPN structure learning. First, we decompose the problem
into i) laying out a computational graph, and ii) learning the so-called scope
function over the graph. The first is rather unproblematic and akin to neural
network architecture validation. The second represents the effective structure
of the SPN and needs to respect the usual structural constraints of SPNs, i.e.,
completeness and decomposability. While representing and learning the scope
function is somewhat involved in general, in this paper, we propose a natural
parametrisation for an important and widely used special case of SPNs. These
structural parameters are incorporated into a Bayesian model, such that
simultaneous structure and parameter learning is cast into monolithic Bayesian
posterior inference. In various experiments, our Bayesian SPNs often improve
test likelihoods over greedy SPN learners. Further, since the Bayesian
framework protects against overfitting, we can evaluate hyper-parameters
directly on the Bayesian model score, waiving the need for a separate
validation set, which is especially beneficial in low data regimes. Bayesian
SPNs can be applied to heterogeneous domains and can easily be extended to
nonparametric formulations. Moreover, our Bayesian approach is the first that
consistently and robustly learns SPN structures under missing data.
Comment: NeurIPS 2019; see conference page for supplement
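For readers unfamiliar with SPNs, the following minimal sketch builds a tiny SPN over two binary variables and checks that the structural constraints named in the abstract (completeness: children of a sum node share the same scope; decomposability: children of a product node have disjoint scopes) yield a normalized density. All weights and helper names are hypothetical; this is not the paper's Bayesian learner.

```python
import math

def leaf(i, theta):
    """Bernoulli leaf over variable x_i with Pr[x_i = 1] = theta."""
    return lambda x: theta if x[i] == 1 else 1.0 - theta

def product(*children):
    """Product node; decomposability: children must have disjoint scopes."""
    return lambda x: math.prod(c(x) for c in children)

def sum_node(weights, children):
    """Sum node; completeness: children must share the same scope."""
    return lambda x: sum(w * c(x) for w, c in zip(weights, children))

# A mixture-of-products SPN over scope {x0, x1} with hypothetical weights.
spn = sum_node([0.3, 0.7], [
    product(leaf(0, 0.9), leaf(1, 0.2)),
    product(leaf(0, 0.1), leaf(1, 0.8)),
])

# A complete, decomposable SPN with normalized sum weights defines a proper
# distribution: its values over all inputs sum to 1.
print(sum(spn((a, b)) for a in (0, 1) for b in (0, 1)))  # -> 1.0
```

The "scope function" the abstract refers to is exactly the assignment of variable subsets to nodes that the comments above enforce by hand; the paper's contribution is to parametrise and learn that assignment within a Bayesian model rather than fixing it a priori.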