237 research outputs found

    Copula Density Estimation by Total Variation Penalized Likelihood with Linear Equality Constraints

    A copula density is the joint probability density function (PDF) of a random vector with uniform marginals. An approach to bivariate copula density estimation is introduced that is based on maximum penalized likelihood estimation (MPLE) with a total variation (TV) penalty term. The marginal unity and symmetry constraints on the copula density are enforced as linear equality constraints. The TV-MPLE subject to linear equality constraints is solved by an augmented Lagrangian and operator-splitting algorithm. It offers an order-of-magnitude improvement in computational efficiency over another, unconstrained TV-MPLE method that is solved by a log-barrier method for a second-order cone program. The regularization parameter is selected in a data-driven way through K-fold cross-validation (CV). Simulations and a real-data application show the effectiveness of the proposed approach. The MATLAB code implementing the methodology is available online.
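
    The marginal unity and symmetry constraints mentioned above can be illustrated on a discretized density. The following is a minimal sketch, not the paper's MATLAB implementation: it assumes a piecewise-constant density on an n-by-n midpoint grid, and it uses the Farlie-Gumbel-Morgenstern (FGM) copula purely as a convenient test case. The function names are hypothetical.

```python
def fgm_density(u, v, theta=0.5):
    """FGM copula density; chosen only as an easy example (assumption)."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


def grid_density(n, density):
    """Evaluate a density at the cell midpoints of an n-by-n grid on [0, 1]^2."""
    mids = [(i + 0.5) / n for i in range(n)]
    return [[density(u, v) for v in mids] for u in mids]


def constraint_residuals(c):
    """Largest violations of the linear equality constraints.

    Marginal unity: every row mean and column mean of the grid equals 1,
    i.e. each marginal of the discretized density is uniform.
    Symmetry: c[i][j] == c[j][i].
    """
    n = len(c)
    row_err = max(abs(sum(row) / n - 1.0) for row in c)
    col_err = max(abs(sum(c[i][j] for i in range(n)) / n - 1.0)
                  for j in range(n))
    sym_err = max(abs(c[i][j] - c[j][i])
                  for i in range(n) for j in range(n))
    return row_err, col_err, sym_err


c = grid_density(8, fgm_density)
row_err, col_err, sym_err = constraint_residuals(c)
print(row_err, col_err, sym_err)  # all three should be at rounding-error level
```

    Each residual is a linear function of the grid values, which is why such constraints can be written as linear equalities and handed to a constrained solver.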

    Factorial graphical lasso for dynamic networks

    Dynamic network models describe a growing number of important scientific processes, from cell biology and epidemiology to sociology and finance. There are many aspects of dynamical networks that require statistical considerations. In this paper we focus on determining network structure. Estimating dynamic networks is a difficult task since the number of components involved in the system is very large. As a result, the number of parameters to be estimated is bigger than the number of observations. However, a characteristic of many networks is that they are sparse. For example, the molecular structure of genes makes interactions with other components a highly structured and therefore sparse process. Penalized Gaussian graphical models have been used to estimate sparse networks. However, the literature has focussed on static networks, which lack specific temporal constraints. We propose a structured Gaussian dynamical graphical model, where structures can consist of specific time dynamics, known presence or absence of links and block equality constraints on the parameters. Thus, the number of parameters to be estimated is reduced and the accuracy of the estimates, including the identification of the network, can be improved. Here, we show that the constrained optimization problem can be solved by taking advantage of an efficient solver, logdetPPA, developed in convex optimization. Moreover, model selection methods for checking the sensitivity of the inferred networks are described. Finally, synthetic and real data illustrate the proposed methodologies. Comment: 30 pp, 5 figures
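
    For orientation, the kind of constrained problem described above can be written in the standard penalized Gaussian graphical model form. This is a sketch under assumed notation, not the paper's exact formulation: S denotes the sample covariance matrix, \Theta the precision matrix, \lambda the penalty weight, Z a set of links known to be absent, and E a set of index pairs tied together by a block equality constraint.

```latex
\hat{\Theta} \;=\; \arg\max_{\Theta \succ 0}
  \Bigl\{ \log\det\Theta \;-\; \operatorname{tr}(S\Theta)
          \;-\; \lambda \sum_{i \neq j} \lvert \theta_{ij} \rvert \Bigr\}
\quad \text{subject to} \quad
\theta_{ij} = 0 \ \text{for } (i,j) \in Z,
\qquad
\theta_{ij} = \theta_{kl} \ \text{for } \bigl((i,j),(k,l)\bigr) \in E .
```

    Both constraint types are linear in \Theta, so the problem stays convex, and each equality removes a free parameter from the estimation.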

    Distributional Regression for Data Analysis

    Flexible modeling of how an entire distribution changes with covariates is an important yet challenging generalization of mean-based regression that has seen growing interest over the past decades in both the statistics and machine learning literature. This review outlines selected state-of-the-art statistical approaches to distributional regression, complemented with alternatives from machine learning. Topics covered include the similarities and differences between these approaches, extensions, properties and limitations, estimation procedures, and the availability of software. In view of the increasing complexity and availability of large-scale data, this review also discusses the scalability of traditional estimation methods, current trends, and open challenges. Illustrations are provided using data on childhood malnutrition in Nigeria and Australian electricity prices. Comment: Accepted for publication in Annual Review of Statistics and Its Application

    Generalized Additive Modeling For Multivariate Distributions

    In this thesis, we develop tools to study the influence of predictors on multivariate distributions. We tackle the issue of conditional dependence modeling using generalized additive models, a natural extension of linear and generalized linear models allowing for smooth functions of the covariates. Compared to existing methods, the framework that we develop has two main advantages. First, it is completely flexible, in the sense that the dependence structure can vary with an arbitrary set of covariates in a parametric, nonparametric or semiparametric way. Second, it is both quick and numerically stable, which means that it is suitable for exploratory data analysis and stepwise model building. Starting from the bivariate case, we extend our framework to pair-copula constructions, and open new possibilities for further applied and methodological work. Our regression-like theory of dependence, built on conditional copulas and generalized additive models, is both theoretically sound and practically useful.

    Bivariate copula additive models for location, scale and shape

    In generalized additive models for location, scale and shape (GAMLSS), the response distribution is not restricted to belong to the exponential family and all the model’s parameters can be made dependent on additive predictors that allow for several types of covariate effects (such as linear, non-linear, random and spatial effects). In many empirical situations, however, simultaneously modeling two or more responses conditional on some covariates can be of considerable relevance. The scope of GAMLSS is extended by introducing bivariate copula models with continuous margins for the GAMLSS class. The proposed computational tool permits the copula dependence and marginal distribution parameters to be estimated simultaneously, and each parameter to be modeled using an additive predictor. Simultaneous parameter estimation is achieved within a penalized likelihood framework using a trust region algorithm with integrated automatic multiple smoothing parameter selection. The introduced approach allows for straightforward inclusion of potentially any parametric marginal distribution and copula function. The models can be easily used via the copulaReg() function in the R package SemiParBIVProbit. The proposal is illustrated through two case studies and simulated data.
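
    As a rough guide to what simultaneous estimation within a penalized likelihood framework involves, the objective for a bivariate copula model with continuous margins typically takes the following form. The notation is assumed for illustration and is not taken from the paper: F_1, F_2 and f_1, f_2 are the marginal distribution and density functions, c is the copula density with dependence parameter \theta_i, \delta collects all regression coefficients, and S_{\lambda} is a block penalty matrix that depends on the smoothing parameters.

```latex
\ell(\delta) \;=\; \sum_{i=1}^{n} \Bigl\{
    \log c\bigl(F_1(y_{1i}),\, F_2(y_{2i});\, \theta_i\bigr)
    \;+\; \log f_1(y_{1i}) \;+\; \log f_2(y_{2i}) \Bigr\},
\qquad
\ell_p(\delta) \;=\; \ell(\delta) \;-\; \tfrac{1}{2}\, \delta^{\top} S_{\lambda}\, \delta .
```

    Every distributional parameter, including \theta_i, is linked to its own additive predictor, which is how covariates enter both the margins and the dependence.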

    On Graphical Models via Univariate Exponential Family Distributions

    Undirected graphical models, or Markov networks, are a popular class of statistical models, used in a wide variety of applications. Popular instances of this class include Gaussian graphical models and Ising models. In many settings, however, it might not be clear which subclass of graphical models to use, particularly for non-Gaussian and non-categorical data. In this paper, we consider a general sub-class of graphical models where the node-wise conditional distributions arise from exponential families. This allows us to derive multivariate graphical model distributions from univariate exponential family distributions, such as the Poisson, negative binomial, and exponential distributions. Our key contributions include a class of M-estimators to fit these graphical model distributions; and rigorous statistical analysis showing that these M-estimators recover the true graphical model structure exactly, with high probability. We provide examples of genomic and proteomic networks learned via instances of our class of graphical models derived from Poisson and exponential distributions. Comment: Journal of Machine Learning Research

    Dependence Modelling and Testing: Copula and Varying Coefficient Model with Missing Data

    This thesis investigates three topics in theoretical econometrics: goodness-of-fit tests for copulas, copula density estimators which preserve the copula property, and bias-correction for the naive kernel local linear estimators in the two-sample varying coefficient model with missing data. In the first topic a family of goodness-of-fit tests for copulas is proposed. The tests use generalizations of the information matrix equality of White (1982). The asymptotic distribution of the generalized tests is derived. In Monte Carlo simulations, the behavior of the new tests is compared with several Cramér-von Mises type tests and the desired properties of the new tests are confirmed in high dimensions. In the second topic, a semi-parametric copula density estimation procedure that guarantees that the estimator is a genuine copula density is outlined. A simulation-based study is constructed to examine the performance of the proposed copula density estimation method and compare it with the leading copula density estimators in the literature. The method is also applied to estimate copula densities in two empirical cases. The third topic shows that the naive kernel estimator using matching data is not consistent in the two-sample varying coefficient model with missing data. A bias-corrected consistent estimator is proposed and the asymptotic theory is discussed. A simulation study is conducted to support the theoretical results.