A convex pseudo-likelihood framework for high dimensional partial correlation estimation with convergence guarantees

Authors: Khare Kshitij; Oh Sang-Yun; Rajaratnam Bala
Publication date: 14/08/2014

Sparse high-dimensional graphical model selection is a topic of much interest in modern-day statistics. A popular approach is to apply l1-penalties to either (1) parametric likelihoods or (2) regularized regression/pseudo-likelihoods, with the latter having the distinct advantage that they do not explicitly assume Gaussianity. As none of the popular methods proposed for solving pseudo-likelihood-based objective functions have provable convergence guarantees, it is not clear if corresponding estimators exist or are even computable, or if they actually yield correct partial correlation graphs. This paper proposes a new pseudo-likelihood-based graphical model selection method that aims to overcome some of the shortcomings of current methods while retaining all their respective strengths. In particular, we introduce a novel framework that leads to a convex formulation of the partial covariance regression graph problem, resulting in an objective function composed of quadratic forms. The objective is then optimized via a coordinate-wise approach. The specific functional form of the objective function facilitates rigorous convergence analysis leading to convergence guarantees, an important property that cannot be established using standard results when the dimension is larger than the sample size, as is often the case in high-dimensional applications. These convergence guarantees ensure that estimators are well-defined under very general conditions and are always computable. In addition, the approach yields estimators that have good large-sample properties and also respect symmetry. Furthermore, application to simulated and real data, timing comparisons, and numerical convergence are demonstrated. We also present a novel unifying framework that places all graphical pseudo-likelihood methods as special cases of a more general formulation, leading to important insights.

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Generalized Pseudolikelihood Methods for Inverse Covariance Estimation

Authors: Ali Alnur; Khare Kshitij; Oh Sang-Yun; Rajaratnam Bala
Publication date: 14/10/2016

We introduce PseudoNet, a new pseudolikelihood-based estimator of the inverse covariance matrix that has a number of useful statistical and computational properties. We show, through detailed experiments with synthetic data as well as real-world finance and wind power data, that PseudoNet outperforms related methods in terms of estimation error and support recovery, making it well-suited for use in a downstream application, where obtaining low estimation error can be important. We also show, under regularity conditions, that PseudoNet is consistent. Our proof assumes the existence of accurate estimates of the diagonal entries of the underlying inverse covariance matrix; we additionally provide a two-step method to obtain these estimates, even in a high-dimensional setting, going beyond the proofs for related methods. Unlike other pseudolikelihood-based methods, we also show that PseudoNet does not saturate, i.e., in high dimensions, there is no hard limit on the number of nonzero entries in the PseudoNet estimate. We present a fast algorithm as well as screening rules that make computing the PseudoNet estimate over a range of tuning parameters tractable.

arXiv.org e-Print Archive

eScholarship - University of California

Learning Gaussian Graphical Models with Latent Confounders

Authors: Franks Alexander; Oh Sang-Yun; Wang Ke
Publication date: 23/07/2023

Gaussian graphical models (GGMs) are widely used to estimate network structures in many applications ranging from biology to finance. In practice, data are often corrupted by latent confounders, which biases inference of the underlying true graphical structure. In this paper, we compare and contrast two strategies for inference in graphical models with latent confounders: Gaussian graphical models with latent variables (LVGGM) and PCA-based removal of confounding (PCA+GGM). While these two approaches have similar goals, they are motivated by different assumptions about confounding. We explore the connection between these two approaches and propose a new method that combines their strengths. We prove the consistency and convergence rate for the PCA-based method and use these results to provide guidance about when to use each method. We demonstrate the effectiveness of our methodology using both simulations and two real-world applications.

arXiv.org e-Print Archive

Partial Separability and Functional Graphical Models for Multivariate Gaussian Processes

Authors: Oh Sang-Yun; Petersen Alexander; Zapata Javier
Publication date: 23/10/2019

The covariance structure of multivariate functional data can be highly complex, especially if the multivariate dimension is large, making extension of statistical methods for standard multivariate data to the functional data setting quite challenging. For example, Gaussian graphical models have recently been extended to the setting of multivariate functional data by applying multivariate methods to the coefficients of truncated basis expansions. However, a key difficulty compared to multivariate data is that the covariance operator is compact, and thus not invertible. The methodology in this paper addresses the general problem of covariance modeling for multivariate functional data, and functional Gaussian graphical models in particular. As a first step, a new notion of separability for multivariate functional data is proposed, termed partial separability, leading to a novel Karhunen-Loève-type expansion for such data. Next, the partial separability structure is shown to be particularly useful in providing a well-defined Gaussian graphical model that can be identified with a sequence of finite-dimensional graphical models, each of fixed dimension. This motivates a simple and efficient estimation procedure through application of the joint graphical lasso. Empirical performance of the method for graphical model estimation is assessed through simulation and analysis of functional brain connectivity during a motor task. Comment: 39 pages, 5 figures

arXiv.org e-Print Archive

eScholarship - University of California

Revealing Fundamental Physics from the Daya Bay Neutrino Experiment using Deep Neural Networks

Authors: Baldi Pierre; Bhimji Wahid; Ko Seyoon; Oh Sang-Yun; Prabhat; Racah Evan; Sadowski Peter; Tull Craig
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2016

Experiments in particle physics produce enormous quantities of data that must be analyzed and interpreted by teams of physicists. This analysis is often exploratory, where scientists are unable to enumerate the possible types of signal prior to performing the experiment. Thus, tools for summarizing, clustering, visualizing and classifying high-dimensional data are essential. In this work, we show that meaningful physical content can be revealed by transforming the raw data into a learned high-level representation using deep neural networks, with measurements taken at the Daya Bay Neutrino Experiment as a case study. We further show how convolutional deep neural networks can provide an effective classification filter with greater than 97% accuracy across different classes of physics events, significantly better than other machine learning approaches.

arXiv.org e-Print Archive

eScholarship - University of California

The Economics of All-You-Can-Read Pricing: Tariff Choice, Contract Renewal, and Switching for E-Book Purchases

Authors: Han Sang Pil; Hong Jinpyo; Moon Jae Yun; Oh Wonseok
Publication venue: AIS Electronic Library (AISeL)
Publication date: 13/12/2015

E-book markets are currently moving through a period of disequilibrium as new pricing structures (i.e., flat-fee subscriptions) are rapidly embraced by major vendors. On the basis of a novel dataset, we investigate how the availability of “all-you-can-read” pricing programs influences consumers’ tariff choice, contract renewal, and switching behaviors. Consistent with the rational choice framework, the findings suggest that most e-book consumers significantly gain from subscription-based tariffs. However, we also find some other intriguing results. Among the three subscription designs examined, the 1-week plan affords consumers more economic benefits than do 1-day or 1-month programs. The economic gains derived from subscription-based tariffs diminish as consumers renew their subscriptions under the same contract duration. Consumers who switch to other plans also suffer from reduced savings. Finally, iOS users are more inclined to select subscription models than are Android users because of the absence of in-app purchase functionalities for the former.

AIS Electronic Library (AISeL)