Learning factor graphs in polynomial time and sample complexity
We study the computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded degree can be learned in polynomial time and from a polynomial number of training examples, assuming that the data is generated by a network in this class. This result covers both parameter estimation for a known network structure and structure learning. It implies as a corollary that we can learn factor graphs for both Bayesian networks and Markov networks of bounded degree, in polynomial time and sample complexity. Importantly, unlike standard maximum likelihood estimation algorithms, our method does not require inference in the underlying network, and so applies to networks where inference is intractable. We also show that the error of our learned model degrades gracefully when the generating distribution is not a member of the target class of networks. In addition to our main result, we show that the sample complexity of parameter learning in graphical models has an O(1) dependence on the number of variables in the model when using the KL-divergence normalized by the number of variables as the performance criterion.
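A toy sketch of the inference-free idea above: each factor's table is set directly from empirical counts over that factor's own scope, so parameter estimation reduces to counting and never runs inference in the joint model. The function name, the additive smoothing, and the log-count parametrization are our illustrative choices, not the paper's exact construction:

```python
from collections import Counter
from itertools import product
import math

def estimate_factors(samples, scopes, alpha=1.0):
    """Estimate one log-probability table per factor scope from counts.

    samples: list of tuples (one value per variable)
    scopes:  list of tuples of variable indices, one per factor
    alpha:   additive smoothing so log-probabilities stay finite
    Returns a dict scope -> {assignment: log empirical probability}.
    Only counting over each scope is needed -- no joint inference."""
    n = len(samples)
    vals = sorted({v for s in samples for v in s})  # observed alphabet
    factors = {}
    for scope in scopes:
        counts = Counter(tuple(x[i] for i in scope) for x in samples)
        total = n + alpha * len(vals) ** len(scope)
        factors[scope] = {
            a: math.log((counts[a] + alpha) / total)
            for a in product(vals, repeat=len(scope))
        }
    return factors

# Toy data over 3 binary variables with factors on (0,1) and (1,2).
data = [(0, 0, 1), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
tables = estimate_factors(data, scopes=[(0, 1), (1, 2)])
```

Because each table depends only on low-order empirical marginals, the per-factor work is polynomial in the sample size even when inference in the full network would be intractable.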
Learning Graphical Models Using Multiplicative Weights
We give a simple, multiplicative-weight update algorithm for learning
undirected graphical models or Markov random fields (MRFs). The approach is
new, and for the well-studied case of Ising models or Boltzmann machines, we
obtain an algorithm that uses a nearly optimal number of samples and has
quadratic running time (up to logarithmic factors), subsuming and improving on
all prior work. Additionally, we give the first efficient algorithm for
learning Ising models over general alphabets.
Our main application is an algorithm for learning the structure of t-wise
MRFs with nearly-optimal sample complexity (up to polynomial losses in
necessary terms that depend on the weights) and running time that is
n^{O(t)}. In addition, given n^{O(t)} samples, we can also learn the
parameters of the model and generate a hypothesis that is close in statistical
distance to the true MRF. All prior work runs in time n^{Omega(d)} for
graphs of bounded degree d and does not generate a hypothesis close in
statistical distance even for t=3. We observe that our runtime has the correct
dependence on n and t assuming the hardness of learning sparse parities with
noise.
Our algorithm -- the Sparsitron -- is easy to implement (has only one parameter)
and works in the on-line setting. Its analysis applies a regret bound from
Freund and Schapire's classic Hedge algorithm. It also gives the first solution
to the problem of learning sparse Generalized Linear Models (GLMs).
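To convey the multiplicative-weights flavor of such an algorithm, here is a Hedge-style sketch for a sparse GLM y ~ sigmoid(w . x) under an L1 budget lam, with one expert per signed coordinate. The loss rule, the duplicated signed experts, and all names are our illustrative choices, not the paper's exact Sparsitron:

```python
import math, random

def sparsitron_sketch(samples, lam, beta=0.9):
    """Hedge-style multiplicative-weights sketch for a sparse GLM
    y ~ sigmoid(w . x) with ||w||_1 <= lam and x in {-1,+1}^n.
    Experts are the signed coordinates +e_i and -e_i; each incurs a
    loss in [0, 1] and is reweighted multiplicatively by beta**loss."""
    n = len(samples[0][0])
    p = [1.0] * (2 * n)                        # one weight per expert
    for x, y in samples:
        z = list(x) + [-v for v in x]          # signed feature copies
        s = sum(p)
        margin = sum(lam * p[i] / s * z[i] for i in range(2 * n))
        pred = 1 / (1 + math.exp(-margin))     # current GLM prediction
        for i in range(2 * n):
            loss = (1 + (pred - y) * z[i]) / 2  # in [0, 1] since |z_i| = 1
            p[i] *= beta ** loss
    s = sum(p)
    # Collapse the signed experts back into one weight per coordinate.
    return [lam * (p[i] - p[i + n]) / s for i in range(n)]

# Toy data: the label depends only on the sign of coordinate 0.
random.seed(0)
def draw():
    x = [random.choice([-1, 1]) for _ in range(5)]
    return x, (1 if x[0] == 1 else 0)
data = [draw() for _ in range(300)]
w_hat = sparsitron_sketch(data, lam=2.0)
```

By construction the returned vector always satisfies the L1 budget, and on this toy data the relevant coordinate's weight comes out positive; tuning one parameter (beta) is all the sketch requires, mirroring the one-parameter simplicity claimed above.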
Sign rank versus VC dimension
This work studies the maximum possible sign rank of N x N sign
matrices with a given VC dimension d. For d=1, this maximum is 3. For
d=2, this maximum is N^{1/2}, up to logarithmic factors. For d>2, similar but
slightly less accurate statements hold. The lower bounds improve over previous
ones by Ben-David et al., and the upper bounds are novel.
The lower bounds are obtained by probabilistic constructions, using a theorem
of Warren in real algebraic geometry. The upper bounds are obtained using a
result of Welzl about spanning trees with low stabbing number, and using the
moment curve.
The upper bound technique is also used to: (i) provide estimates on the
number of classes of a given VC dimension, and the number of maximum classes of
a given VC dimension -- answering a question of Frankl from '89, and (ii)
design an efficient algorithm that provides an O(N/log N) multiplicative
approximation for the sign rank.
We also observe a general connection between sign rank and spectral gaps,
which is based on Forster's argument. Consider the N x N adjacency
matrix of a Delta-regular graph whose second eigenvalue has absolute value
lambda, with Delta <= N/2. We show that the sign rank of the signed
version of this matrix is at least Delta/lambda. We use this connection to
prove the existence of a maximum class with VC
dimension 2 and sign rank N^{1/2}, up to logarithmic factors. This answers a question
of Ben-David et al.~regarding the sign rank of large VC classes. We also
describe limitations of this approach, in the spirit of the Alon-Boppana
theorem.
We further describe connections to communication complexity, geometry,
learning theory, and combinatorics.
Comment: 33 pages. This is a revised version of the paper "Sign rank versus VC
dimension". Additional results in this version: (i) estimates on the number
of maximum VC classes (answering a question of Frankl from '89); (ii)
estimates on the sign rank of large VC classes (answering a question of
Ben-David et al. from '03); (iii) a discussion of the computational
complexity of computing the sign rank.
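To make the central quantity concrete, here is a small brute-force sketch that computes the VC dimension of the set system induced by the rows of a sign matrix (a column set is shattered when the rows realize all sign patterns on it). The helper name and the +1/-1 encoding are ours, purely for illustration:

```python
from itertools import combinations, product

def vc_dimension(M):
    """VC dimension of the set system given by the rows of sign matrix M:
    row r 'contains' column c when M[r][c] == +1. A column set S is
    shattered if the rows realize all 2^|S| sign patterns on S."""
    n_cols = len(M[0])
    best = 0
    for d in range(1, n_cols + 1):
        found = False
        for S in combinations(range(n_cols), d):
            patterns = {tuple(row[c] for c in S) for row in M}
            if len(patterns) == 2 ** d:
                found, best = True, d
                break
        if not found:
            break
    return best

# The 8 x 3 matrix of all sign patterns shatters all three columns.
full = [list(p) for p in product([-1, 1], repeat=3)]
# An identity-style matrix (+1 on the diagonal, -1 elsewhere) shatters
# only singletons: no column pair sees all four sign patterns.
ident = [[1 if i == j else -1 for j in range(4)] for i in range(4)]
```

The brute force is exponential in the number of columns, which is fine for toy matrices; the abstract's point is precisely that sign rank can be far larger than such small VC dimensions allow one to guess.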