Ridge Fusion in Statistical Learning
We propose a penalized likelihood method to jointly estimate multiple
precision matrices for use in quadratic discriminant analysis and model based
clustering. A ridge penalty and a ridge fusion penalty are used to introduce
shrinkage and promote similarity between precision matrix estimates. Block-wise
coordinate descent is used for optimization, and validation likelihood is used
for tuning parameter selection. Our method is applied in quadratic discriminant
analysis and semi-supervised model based clustering.
Comment: 24 pages, 9 tables, 3 figures.
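To make the shape of the estimator concrete, here is a minimal sketch of the kind of ridge-fused penalized log-likelihood the abstract describes, for K classes with precision matrices Omega_k. The symbols S_k (class sample covariances), n_k (class sample sizes), and the tuning parameters lambda_1, lambda_2 are my notation chosen for illustration, not necessarily the authors' exact formulation.

```latex
% Sketch of a jointly penalized log-likelihood for K precision matrices
% \Omega_1,\dots,\Omega_K: a ridge penalty shrinks each estimate, while a
% ridge fusion penalty pulls the estimates toward one another.
\max_{\Omega_1,\dots,\Omega_K \succ 0} \;
  \sum_{k=1}^{K} \frac{n_k}{2}\Big( \log\det\Omega_k - \operatorname{tr}(S_k \Omega_k) \Big)
  \;-\; \frac{\lambda_1}{2} \sum_{k=1}^{K} \lVert \Omega_k \rVert_F^2
  \;-\; \frac{\lambda_2}{2} \sum_{k < m} \lVert \Omega_k - \Omega_m \rVert_F^2
```

Block-wise coordinate descent of the kind mentioned above would then cycle through the Omega_k, maximizing over one precision matrix at a time with the others held fixed, while lambda_1 and lambda_2 are chosen by validation likelihood.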
Statistical relational learning with soft quantifiers
Quantification in statistical relational learning (SRL) is either existential or universal; humans, however, might be more inclined to express knowledge using soft quantifiers such as "most" and "a few". In this paper, we define the syntax and semantics of PSL^Q, a new SRL framework that supports reasoning with soft quantifiers, and present its most probable explanation (MPE) inference algorithm. To the best of our knowledge, PSL^Q is the first SRL framework that combines soft quantifiers with first-order logic rules for modelling uncertain relational data. Our experimental results for link prediction in social trust networks demonstrate that the use of soft quantifiers not only allows for a natural and intuitive formulation of domain knowledge, but also improves the accuracy of inferred results.
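As a purely illustrative example of what such a rule could look like (the bracket notation, the quantifier name, and the Trusts predicate below are assumptions for illustration, not PSL^Q's actual syntax), a soft-quantified trust-propagation rule might say that if most of the users trusted by b also trust c, then b trusts c:

```latex
% Hypothetical illustration only: a rule whose body is governed by the soft
% quantifier "Most" rather than a hard universal quantifier, so it is
% satisfied to a degree in [0,1] depending on how many trusted neighbours agree.
\mathrm{Most}\big[\, x : \mathrm{Trusts}(b, x) \,\big]\ \mathrm{Trusts}(x, c)
  \;\rightarrow\; \mathrm{Trusts}(b, c)
```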
Learning Arbitrary Statistical Mixtures of Discrete Distributions
We study the problem of learning from unlabeled samples very general statistical mixture models on large finite sets. Specifically, the model to be learned, $\vartheta$, is a probability distribution over probability distributions $p$, where each such $p$ is a probability distribution over $[n] = \{1, 2, \dots, n\}$. When we sample from $\vartheta$, we do not observe $p$ directly, but only indirectly and in very noisy fashion, by sampling from $[n]$ repeatedly, independently $K$ times from the distribution $p$. The problem is to infer $\vartheta$ to high accuracy in transportation (earthmover) distance.
We give the first efficient algorithms for learning this mixture model without making any restricting assumptions on the structure of the distribution $\vartheta$. We bound the quality of the solution as a function of the size of the samples and the number of samples used. Our model and results have applications to a variety of unsupervised learning scenarios, including learning topic models and collaborative filtering.
Comment: 23 pages. Preliminary version in the Proceedings of the 47th ACM Symposium on the Theory of Computing (STOC 2015).
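The two-stage generative process behind this model is easy to state in code. The following is a minimal sketch under assumed inputs (a finite mixing measure represented as weighted component distributions); it only illustrates how the hidden $p$ is observed solely through $K$ noisy draws, and is not the authors' learning algorithm.

```python
import numpy as np

def sample_observation(weights, components, K, rng=np.random.default_rng()):
    """Draw one observation from the mixture model described above.

    weights    -- mixing probabilities over the (here, finitely many) components of theta
    components -- list of probability vectors p, each a distribution over {0, ..., n-1}
    K          -- number of i.i.d. draws from the hidden p that are actually observed
    """
    # First stage: draw a hidden distribution p ~ theta (never observed directly).
    idx = rng.choice(len(components), p=weights)
    p = components[idx]
    # Second stage: observe p only through K independent samples from it.
    counts = rng.multinomial(K, p)
    return counts  # a length-n histogram; the learner must infer theta from many such histograms

# Example: theta is a mixture of two distributions over a 4-element set.
weights = [0.7, 0.3]
components = [np.array([0.4, 0.3, 0.2, 0.1]), np.array([0.1, 0.1, 0.1, 0.7])]
data = [sample_observation(weights, components, K=5) for _ in range(1000)]
```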
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.
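The symmetric structure of the taxonomy is simple enough to write down directly. The sketch below merely restates the four transformation tasks from the abstract as a small data structure applied to both nodes and links; the identifier names are mine, chosen for illustration.

```python
# The four representation-transformation tasks from the taxonomy, applied
# symmetrically to both nodes and links.
TRANSFORMATION_TASKS = (
    "predict existence",
    "predict label or type",
    "estimate weight or importance",
    "construct relevant features",
)

TAXONOMY = {target: TRANSFORMATION_TASKS for target in ("node", "link")}

for target, tasks in TAXONOMY.items():
    for task in tasks:
        print(f"{target}: {task}")
```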
Statistical Mechanics of Time Domain Ensemble Learning
Conventional ensemble learning combines students in the space domain. In this paper, by contrast, we combine students in the time domain and call this time domain ensemble learning. We analyze the generalization performance of time domain ensemble learning in the framework of online learning using a statistical mechanical method, treating a model in which both the teacher and the student are linear perceptrons with noise. Our analysis shows that time domain ensemble learning is twice as effective as conventional space domain ensemble learning.
Comment: 10 pages, 10 figures.
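To make the distinction between the two kinds of combination concrete, here is a minimal sketch; the learning rule, noise model, parameter values, and averaging scheme are assumptions chosen for illustration, not the authors' exact setup. A space domain ensemble averages several independently trained students at one time step, while a time domain ensemble averages a single student's hypotheses taken at several time steps (for linear students, averaging weight vectors is the same as averaging outputs).

```python
import numpy as np

rng = np.random.default_rng(0)
N, steps = 100, 2000            # input dimension, number of online updates (illustrative)
eta, noise = 0.1, 0.3           # learning rate and teacher output noise level (illustrative)

teacher = rng.standard_normal(N)

def train_student(snapshot_every=None):
    """Online learning of a linear student from a noisy linear teacher;
    optionally keep periodic weight snapshots for time domain averaging."""
    w, snapshots = np.zeros(N), []
    for t in range(steps):
        x = rng.standard_normal(N) / np.sqrt(N)
        y = teacher @ x + noise * rng.standard_normal()   # noisy teacher output
        w += eta * (y - w @ x) * x                         # gradient-style online update
        if snapshot_every and (t + 1) % snapshot_every == 0:
            snapshots.append(w.copy())
    return w, snapshots

# Space domain ensemble: average the final weights of several independent students.
space_ensemble = np.mean([train_student()[0] for _ in range(5)], axis=0)

# Time domain ensemble: average snapshots of one student taken at different times.
_, snaps = train_student(snapshot_every=steps // 5)
time_ensemble = np.mean(snaps, axis=0)

# Compare the two combinations via distance to the teacher (smaller is better).
print("space domain:", np.linalg.norm(space_ensemble - teacher))
print("time domain: ", np.linalg.norm(time_ensemble - teacher))
```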
Fast rates in statistical and online learning
The speed with which a learning algorithm converges as it is presented with
more data is a central problem in machine learning --- a fast rate of
convergence means less data is needed for the same level of performance. The
pursuit of fast rates in online and statistical learning has led to the
discovery of many conditions in learning theory under which fast learning is
possible. We show that most of these conditions are special cases of a single,
unifying condition, that comes in two forms: the central condition for 'proper'
learning algorithms that always output a hypothesis in the given model, and
stochastic mixability for online algorithms that may make predictions outside
of the model. We show that under surprisingly weak assumptions both conditions
are, in a certain sense, equivalent. The central condition has a
re-interpretation in terms of convexity of a set of pseudoprobabilities,
linking it to density estimation under misspecification. For bounded losses, we
show how the central condition enables a direct proof of fast rates and we
prove its equivalence to the Bernstein condition, itself a generalization of
the Tsybakov margin condition, both of which have played a central role in
obtaining fast rates in statistical learning. Yet, while the Bernstein
condition is two-sided, the central condition is one-sided, making it more
suitable to deal with unbounded losses. In its stochastic mixability form, our
condition generalizes both a stochastic exp-concavity condition identified by
Juditsky, Rigollet and Tsybakov and Vovk's notion of mixability. Our unifying
conditions thus provide a substantial step towards a characterization of fast
rates in statistical learning, similar to how classical mixability
characterizes constant regret in the sequential prediction with expert advice
setting.
Comment: 69 pages, 3 figures.
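As a pointer to the kind of statement involved, one common way to write the eta-central condition for a loss $\ell$, model $\mathcal{F}$, and data distribution $P$ is sketched below; treat it as a paraphrase rather than the paper's exact definition, since the formulation there may differ in details.

```latex
% eta-central condition (sketch): some f* in the model dominates every f
% in an exponential-moment sense. Note that the condition is one-sided in f,
% which is what makes it suitable for unbounded losses.
\exists\, f^* \in \mathcal{F}\ \ \forall f \in \mathcal{F}:\quad
  \mathbb{E}_{Z \sim P}\!\left[ e^{-\eta\,\left(\ell(f, Z) - \ell(f^*, Z)\right)} \right] \le 1 .
```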
