Structure Learning of Partitioned Markov Networks
We learn the structure of a Markov network (MN) between two groups of random
variables from joint observations. Since modelling and learning the full MN
structure may be hard, learning the links between the two groups directly may be
preferable. We introduce a novel concept called the partitioned ratio, whose
factorization directly associates with the Markovian properties of random
variables across the two groups. A simple one-shot convex optimization procedure
is proposed for learning sparse factorizations of the partitioned ratio, and it
is theoretically guaranteed to recover the correct inter-group structure under
mild conditions. The performance of the proposed method is experimentally
compared with state-of-the-art MN structure learning methods using ROC curves.
Real applications to analyzing bipartisanship in the US Congress and to pairwise
DNA/time-series alignments are also reported.
Comment: Camera-ready for ICML 2016. Fixed some minor typos.
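The inter-group recovery task described above can be illustrated with a much simpler baseline than the paper's partitioned-ratio estimator: node-wise sparse (lasso) regressions from one group onto each variable of the other, keeping the variables with nonzero coefficients as estimated cross-links. This is a hedged sketch of the problem setup, not the paper's method; the function name and thresholds are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def intergroup_structure(X, Y, alpha=0.1, thresh=1e-3):
    """Estimate which variables in group X link to each variable in group Y
    via node-wise lasso regressions.  A neighborhood-selection baseline for
    the inter-group structure learning task, NOT the partitioned-ratio
    estimator from the paper."""
    p, q = X.shape[1], Y.shape[1]
    A = np.zeros((p, q), dtype=bool)
    for j in range(q):
        coef = Lasso(alpha=alpha).fit(X, Y[:, j]).coef_
        A[:, j] = np.abs(coef) > thresh  # keep variables with nonzero weight
    return A  # A[i, j] True => estimated edge between X_i and Y_j
```

On synthetic data where only one cross-group link exists, the recovered adjacency matrix is sparse and concentrated on that link.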
Lower Bounds for Two-Sample Structural Change Detection in Ising and Gaussian Models
The change detection problem is to determine whether the Markov network
structures of two Markov random fields differ from one another, given two sets
of samples drawn from the respective underlying distributions. We study the
trade-off between the sample sizes and the reliability of change detection,
measured as a minimax risk, for the important cases of Ising models and
Gaussian Markov random fields restricted to models whose network structures
have p nodes and degree at most d, and obtain information-theoretic lower
bounds for reliable change detection over these models. We show that for the
Ising model, the number of samples required from each dataset to detect even
the sparsest possible changes grows with p and d, and that for the Gaussian,
the required sample size additionally depends on the smallest ratio of
off-diagonal to diagonal terms in the precision matrices of the distributions.
These bounds are compared to the corresponding results in structure learning,
and closely match them under mild conditions on the model parameters. Thus,
our change detection bounds inherit partial tightness from the structure
learning schemes in the previous literature, demonstrating that in certain
parameter regimes, the naive structure-learning-based approach to change
detection is minimax optimal up to constant factors.
Comment: Presented at the 55th Annual Allerton Conference on Communication,
Control, and Computing, Oct. 2017.
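The "naive structure learning based approach" the abstract refers to can be sketched directly: estimate a sparse precision matrix from each sample set and compare edge supports. The sketch below uses scikit-learn's graphical lasso as a stand-in structure learner; the penalty and threshold values are illustrative assumptions, not from the paper.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def detect_change(samples_a, samples_b, alpha=0.1, tol=1e-3):
    """Naive two-sample change detection: fit a sparse Gaussian graphical
    model to each sample set and compare the resulting edge supports.
    This is the structure-learning baseline the lower bounds are compared
    against, not the lower-bound analysis itself."""
    supports = []
    for S in (samples_a, samples_b):
        prec = GraphicalLasso(alpha=alpha).fit(S).precision_
        edges = np.abs(prec) > tol     # support of the precision matrix
        np.fill_diagonal(edges, False)  # ignore self-loops
        supports.append(edges)
    changed = supports[0] ^ supports[1]  # symmetric difference of edge sets
    return changed.any(), changed
```

With enough samples per dataset, an added edge between two variables shows up as an entry of the symmetric difference; the lower bounds quantify how many samples "enough" must be.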
Trimmed Density Ratio Estimation
Density ratio estimation is a vital tool in both the machine learning and
statistics communities. However, due to the unbounded nature of the density
ratio, the estimation procedure can be vulnerable to corrupted data points,
which often push the estimated ratio toward infinity. In this paper, we present
a robust estimator which automatically identifies and trims outliers. The
proposed estimator has a convex formulation, and the global optimum can be
obtained via subgradient descent. We analyze the parameter estimation error of
this estimator under high-dimensional settings. Experiments are conducted to
verify the effectiveness of the estimator.
Comment: Made minor revisions. Restructured the introductory section.
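The robustness issue the abstract describes (a few corrupted points pushing the estimated ratio toward infinity) can be illustrated with a simplified stand-in: estimate the density ratio by probabilistic classification, then trim (cap) the largest estimated ratios. The paper instead builds trimming into a single convex objective; this sketch, including the trimming fraction, is only an assumption-laden illustration of the idea.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def trimmed_density_ratio(X_num, X_den, trim_frac=0.05):
    """Estimate r(x) = p_num(x) / p_den(x) at the denominator points by
    probabilistic classification, then trim the largest ratios.  A
    simplified sketch of the robustness idea; not the paper's convex
    trimmed estimator."""
    X = np.vstack([X_num, X_den])
    y = np.concatenate([np.ones(len(X_num)), np.zeros(len(X_den))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    p = clf.predict_proba(X_den)[:, 1]          # P(sample came from numerator)
    ratio = (p / (1 - p)) * (len(X_den) / len(X_num))  # Bayes' rule correction
    cap = np.quantile(ratio, 1 - trim_frac)     # trim: cap extreme ratios
    return np.minimum(ratio, cap)
```

When the two samples come from the same distribution, the trimmed ratios stay near one; a corrupted point cannot blow the estimate up past the cap.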
Consistent change-point detection with kernels
In this paper we study the kernel change-point algorithm (KCP) proposed by Arlot, Celisse and Harchaoui (2012), which aims at locating an unknown number of change-points in the distribution of a sequence of independent data taking values in an arbitrary set. The change-points are selected by model selection with a penalized kernel empirical criterion. We provide a non-asymptotic result showing that, with high probability, the KCP procedure retrieves the correct number of change-points, provided that the constant in the penalty is well chosen; in addition, KCP estimates the change-point locations at the optimal rate. As a consequence, when using a characteristic kernel, KCP detects all kinds of change in the distribution (not only changes in the mean or the variance), and it is able to do so for complex structured data (not necessarily in R^d). Most of the analysis is conducted assuming that the kernel is bounded; part of the results can be extended when we only assume a finite second-order moment.
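The penalized kernel criterion above can be sketched as a dynamic program: each candidate segment pays a kernel within-segment scatter cost plus a fixed per-segment penalty, and the minimizing segmentation yields the change-points. This is a hedged sketch in the spirit of KCP, using a constant penalty rather than KCP's specific data-driven penalty form.

```python
import numpy as np

def rbf_gram(x, bw=1.0):
    """Gaussian-kernel Gram matrix of a 1-D sequence (illustrative choice)."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2 * bw ** 2))

def kernel_changepoints(x, penalty=1.0, bw=1.0):
    """Penalized kernel change-point detection in the spirit of KCP:
    dynamic programming over segmentations, with a kernel within-segment
    cost plus a per-segment penalty.  Sketch only; KCP's penalty has a
    specific form not reproduced here."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    K = rbf_gram(x, bw)
    csum_diag = np.concatenate([[0.0], np.cumsum(np.diag(K))])
    C = np.zeros((n + 1, n + 1))          # 2-D prefix sums of K
    C[1:, 1:] = K.cumsum(0).cumsum(1)

    def cost(s, e):  # within-segment kernel scatter of x[s:e]
        block = C[e, e] - C[s, e] - C[e, s] + C[s, s]
        return (csum_diag[e] - csum_diag[s]) - block / (e - s)

    F = np.full(n + 1, np.inf)            # best penalized cost up to t
    F[0] = 0.0
    back = np.zeros(n + 1, dtype=int)     # start of last segment ending at t
    for t in range(1, n + 1):
        for s in range(t):
            v = F[s] + cost(s, t) + penalty
            if v < F[t]:
                F[t], back[t] = v, s
    cps, t = [], n                        # backtrack segment boundaries
    while t > 0:
        t = back[t]
        if t > 0:
            cps.append(t)
    return sorted(cps)
```

On a sequence with a single clear shift in mean, the program returns one boundary near the true change; a characteristic kernel would likewise expose changes beyond the mean and variance.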
Probabilistic Structured Models for Plant Trait Analysis
University of Minnesota Ph.D. dissertation. March 2017. Major: Communication Sciences and Disorders. Advisor: Arindam Banerjee. 1 computer file (PDF); xii, 171 pages.
Many fields in modern science and engineering, such as ecology, computational biology, astronomy, signal processing, climate science, brain imaging, and natural language processing, involve collecting data sets in which the dimensionality of the data p exceeds the sample size n. Since it is usually impossible to obtain consistent procedures unless p < n, a line of recent work has studied models with various types of low-dimensional structure, including sparse vectors, sparse structured graphical models, low-rank matrices, and combinations thereof. In such settings, a general approach to estimation is to solve a regularized optimization problem, which combines a loss function measuring how well the model fits the data with a regularization function that encourages the assumed structure. Of particular interest is structure learning of graphical models in the high-dimensional setting. The majority of statistical analyses of graphical model estimation assume that the data are fully observed and that all data points are sampled from the same distribution, and provide sample complexities and convergence rates by considering only one graphical structure for all the observations. In this thesis, we extend these results to estimate the structure of graphical models where the data is partially observed or sampled from multiple distributions.
First, we consider the problem of estimating the change in the dependency structure of two p-dimensional models, based on samples drawn from two graphical models. The change is assumed to be structured, e.g., sparse, block sparse, or node-perturbed sparse, such that it can be characterized by a suitable (atomic) norm. We present and analyze a norm-regularized estimator for directly estimating the change in structure, without having to estimate the structures of the individual graphical models. Next, we consider the problem of estimating the sparse structure of Gaussian copula distributions (corresponding to non-paranormal distributions) using samples with missing values. We prove that our proposed estimators consistently estimate the non-paranormal correlation matrix, where the convergence rate depends on the probability of missing values.
In the second part of the thesis, we consider the matrix completion problem. Low-rank matrix completion methods have been successful in a variety of settings, such as recommendation systems. However, most existing matrix completion methods only provide a point estimate of the missing entries and do not characterize the uncertainty of the predictions. First, we show that the posterior distribution in latent factor models, such as probabilistic matrix factorization, when marginalized over one latent factor, has the Matrix Generalized Inverse Gaussian (MGIG) distribution. We show that the MGIG is unimodal, and that the mode can be obtained by solving an algebraic Riccati equation. This characterization leads to a novel collapsed Monte Carlo inference algorithm for such latent factor models. Next, we propose a Bayesian hierarchical probabilistic matrix factorization (BHPMF) model to 1) incorporate hierarchical side information and 2) provide uncertainty-quantified predictions. The former yields significant performance improvements in the problem of plant trait prediction, a key problem in ecology, by leveraging the taxonomic hierarchy in the plant kingdom. The latter is helpful in identifying predictions of low confidence, which can in turn be used to guide field work for data collection efforts.
Finally, we consider applications of probabilistic structured models to plant trait analysis. We apply the BHPMF model to fill the gaps in the TRY database. BHPMF is the state-of-the-art model for plant trait prediction and is gaining increasing visibility and usage in plant trait analysis. We have submitted an R package for BHPMF to CRAN. Next, we apply Gaussian graphical model structure estimators to obtain trait-trait interactions. We study the trait-trait interaction structure at different climate zones and among different plant growth forms, and uncover the dependence of traits on climate and on vegetation.
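The trait-trait interaction analysis described above can be sketched with a standard graphical lasso, as a minimal stand-in for the thesis's structure estimators. Note the assumptions: the trait matrix is taken to be complete (the thesis explicitly handles missing values, which this sketch does not), and the penalty and threshold are illustrative.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def trait_interactions(traits, alpha=0.1, tol=1e-3):
    """Estimate a sparse trait-trait interaction graph with the graphical
    lasso.  `traits` is an (n_plants, n_traits) matrix, assumed fully
    observed here; the thesis's estimators also handle missing values."""
    traits = (traits - traits.mean(0)) / traits.std(0)  # standardize traits
    prec = GraphicalLasso(alpha=alpha).fit(traits).precision_
    adj = np.abs(prec) > tol          # nonzero precision => interaction
    np.fill_diagonal(adj, False)
    return adj  # adj[i, j] True => conditional dependence between traits i, j
```

Running the same estimator separately on plants from different climate zones or growth forms, as in the thesis, would then allow the recovered graphs to be compared.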