67 research outputs found
Theoretical properties of the log-concave maximum likelihood estimator of a multidimensional density
We present theoretical properties of the log-concave maximum likelihood
estimator of a density based on an independent and identically distributed
sample in $\mathbb{R}^d$. Our study covers both the case where the true
underlying density is log-concave, and where this model is misspecified. We
begin by showing that for a sequence of log-concave densities, convergence in
distribution implies much stronger types of convergence -- in particular, it
implies convergence in Hellinger distance and even in certain exponentially
weighted total variation norms. In our main result, we prove the existence and
uniqueness of a log-concave density that minimises the Kullback--Leibler
divergence from the true density over the class of all log-concave densities, and
also show that the log-concave maximum likelihood estimator converges almost
surely in these exponentially weighted total variation norms to this minimiser.
In the case of a correctly specified model, this demonstrates a strong type of
consistency for the estimator; in a misspecified model, it shows that the
estimator converges to the log-concave density that is closest in the
Kullback--Leibler sense to the true density.
Comment: 20 pages, 0 figures
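For reference, the objects named in this abstract can be written out as follows. This is a sketch in standard notation, not the paper's own statement: the symbols $f_0$ (true density), $f^*$ (Kullback--Leibler minimiser), $\hat f_n$ (estimator) and the constant $a$ are assumptions introduced here, and the precise conditions on $f_0$ and the admissible range of $a$ are given in the paper rather than in the abstract. Some authors include a factor of 1/2 in the squared Hellinger distance.

```latex
\begin{align*}
f \text{ is log-concave}
  &\iff \log f \text{ is a concave function with values in } [-\infty, \infty), \\
h^2(f, g)
  &= \int_{\mathbb{R}^d} \bigl(\sqrt{f(x)} - \sqrt{g(x)}\bigr)^2 \, dx, \\
d_{\mathrm{KL}}(f_0, f)
  &= \int_{\mathbb{R}^d} f_0(x) \log \frac{f_0(x)}{f(x)} \, dx, \\
f^*
  &= \operatorname*{arg\,min}_{f \text{ log-concave}} d_{\mathrm{KL}}(f_0, f), \\
\int_{\mathbb{R}^d} e^{a \lVert x \rVert} \, \bigl| \hat f_n(x) - f^*(x) \bigr| \, dx
  &\longrightarrow 0 \quad \text{almost surely, for suitable } a > 0 .
\end{align*}
```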
LogConcDEAD: An R Package for Maximum Likelihood Estimation of a Multivariate Log-Concave Density
In this article we introduce the R package LogConcDEAD (Log-concave density estimation in arbitrary dimensions). Its main function is to compute the nonparametric maximum likelihood estimator of a log-concave density. Functions for plotting, sampling from the density estimate and evaluating the density estimate are provided. All of the functions available in the package are illustrated using simple, reproducible examples with simulated data.
LogConcDEAD: An R package for maximum likelihood estimation of a multivariate log-concave density
In this document we introduce the R package LogConcDEAD (Log-concave density estimation in arbitrary dimensions). Its main function is to compute the nonparametric maximum likelihood estimator of a log-concave density. Functions for plotting, sampling from the density estimate and evaluating the density estimate are provided. All of the functions available in the package are illustrated using simple, reproducible examples with simulated data.
Maximum likelihood estimation of a multivariate log-concave density
Density estimation is a fundamental statistical problem. Many methods are either sensitive to model misspecification (parametric models) or difficult to calibrate, especially for multivariate data (nonparametric smoothing methods). We propose an alternative approach using maximum likelihood under a qualitative assumption on the shape of the density, specifically log-concavity. The class of log-concave densities includes many common parametric families and has desirable properties. For univariate data, these estimators are relatively well understood, and are gaining in popularity in theory and practice. We discuss extensions for multivariate data, which require different techniques.
After establishing existence and uniqueness of the log-concave maximum likelihood estimator for multivariate data, we see that a reformulation allows us to compute it using standard convex optimization techniques. Unlike kernel density estimation, or other nonparametric smoothing methods, this is a fully automatic procedure, and no additional tuning parameters are required.
Since the assumption of log-concavity is non-trivial, we introduce a method for assessing the suitability of this shape constraint and apply it to several simulated datasets and one real dataset. Density estimation is often one stage in a more complicated statistical procedure. With this in mind, we show how the estimator may be used for plug-in estimation of statistical functionals. A second important extension is the use of log-concave components in mixture models. We illustrate how we may use an EM-style algorithm to fit mixture models where the number of components is known. Applications to visualization and classification are presented. In the latter case, improvement over a Gaussian mixture model is demonstrated.
Performance for density estimation is evaluated in two ways. Firstly, we consider Hellinger convergence (the usual metric of theoretical convergence results for nonparametric maximum likelihood estimators). We prove consistency with respect to this metric and heuristically discuss rates of convergence and model misspecification, supported by empirical investigation. Secondly, we use the mean integrated squared error to demonstrate favourable performance compared with kernel density estimates using a variety of bandwidth selectors, including sophisticated adaptive methods.
Throughout, we emphasise the development of stable numerical procedures able to handle the additional complexity of multivariate data.
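The two performance criteria mentioned above can be illustrated numerically. The following is a minimal sketch, not the thesis's procedure: it compares two univariate normal densities (a hypothetical "truth" and a hypothetical "estimate" shifted by 0.5) using a simple trapezoidal rule, and it computes the integrated squared error for one fixed estimate, whereas the mean integrated squared error would average this quantity over repeated samples. All function names are illustrative.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def trapezoid(values, step):
    """Trapezoidal rule on an equally spaced grid."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def squared_hellinger(f, g, grid):
    """Integral of (sqrt(f) - sqrt(g))^2 over the grid (no factor 1/2)."""
    step = grid[1] - grid[0]
    return trapezoid([(math.sqrt(f(x)) - math.sqrt(g(x))) ** 2 for x in grid], step)


def integrated_squared_error(f, g, grid):
    """Integral of (f - g)^2 over the grid."""
    step = grid[1] - grid[0]
    return trapezoid([(f(x) - g(x)) ** 2 for x in grid], step)


# Equally spaced grid on [-12, 12]; both densities are negligible outside it.
grid = [-12.0 + 24.0 * i / 20000 for i in range(20001)]


def truth(x):
    return normal_pdf(x)  # hypothetical true density


def estimate(x):
    return normal_pdf(x, mean=0.5)  # hypothetical estimate


h2 = squared_hellinger(truth, estimate, grid)
ise = integrated_squared_error(truth, estimate, grid)
print(f"squared Hellinger distance: {h2:.6f}")
print(f"integrated squared error:   {ise:.6f}")
```

For two unit-variance normals whose means differ by mu, both quantities have closed forms, 2(1 - exp(-mu^2/8)) and (1 - exp(-mu^2/4))/sqrt(pi) respectively, which gives a convenient check on the numerical integration.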
Estimating the Effect of Liver and Pancreas Volume and Fat Content on Risk of Diabetes: A Mendelian Randomization Study
Fat content and volume of liver and pancreas are associated with risk of diabetes in observational studies; whether these associations are causal is unknown. We conducted a Mendelian randomization (MR) study to examine causality of such associations. We used genetic variants associated (P < 5 × 10⁻⁸) with the exposures (liver and pancreas volume and fat content) using MRI scans of UK Biobank participants (n = 32,859). We obtained summary-level data for risk of type 1 (9,358 cases) and type 2 (55,005 cases) diabetes from the largest available genome-wide association studies. We performed inverse-variance weighted MR as main analysis and several sensitivity analyses to assess pleiotropy and to exclude variants with potential pleiotropic effects. Observationally, liver fat and volume were associated with type 2 diabetes (odds ratio per 1 SD higher exposure 2.16 [2.02, 2.31] and 2.11 [1.96, 2.27], respectively). Pancreatic fat was associated with type 2 diabetes (1.42 [1.34, 1.51]) but not type 1 diabetes, and pancreas volume was negatively associated with type 1 diabetes (0.42 [0.36, 0.48]) and type 2 diabetes (0.73 [0.68, 0.78]). MR analysis provided evidence only for a causal role of liver fat and pancreas volume in risk of type 2 diabetes (1.27 [1.08, 1.49] or 27% increased risk and 0.76 [0.62, 0.94] or 24% decreased risk per 1 SD, respectively) and no causal associations with type 1 diabetes. Our findings assist in understanding the causal role of ectopic fat in the liver and pancreas and of organ volume in the pathophysiology of type 1 and 2 diabetes. [Abstract copyright: © 2022 by the American Diabetes Association.]
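The inverse-variance weighted analysis named in this abstract can be sketched as follows. This is the textbook fixed-effect IVW estimator from summary statistics, given as an illustration under stated assumptions rather than the study's actual pipeline; the function names and the three per-variant numbers are hypothetical and are not data from the study. The second helper only reproduces the arithmetic that turns an odds ratio into a percentage change in odds (1.27 gives +27%, 0.76 gives -24%).

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    Each variant j contributes the Wald ratio beta_outcome[j] / beta_exposure[j],
    weighted by beta_exposure[j]**2 / se_outcome[j]**2.
    Returns (estimate, standard error) on the log-odds scale.
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


def odds_ratio_to_percent_change(odds_ratio):
    """Percentage change in odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


# Hypothetical summary statistics for three variants (illustration only).
bx = [0.1, 0.2, 0.3]      # variant-exposure effects, in SD units
by = [0.05, 0.10, 0.15]   # variant-outcome effects, in log-odds
se = [0.01, 0.02, 0.01]   # standard errors of the variant-outcome effects

beta, beta_se = ivw_estimate(bx, by, se)
print(f"IVW log-odds per 1 SD: {beta:.3f} (SE {beta_se:.3f})")
print(f"Odds ratio per 1 SD:   {math.exp(beta):.3f}")
print(f"OR 1.27 -> {odds_ratio_to_percent_change(1.27):+.0f}% odds")
print(f"OR 0.76 -> {odds_ratio_to_percent_change(0.76):+.0f}% odds")
```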
Differing genetic variants associated with liver fat and their contrasting relationships with cardiovascular diseases and cancer.
The underlying mechanisms for the link between steatotic liver disease and cardiovascular and cancer outcomes are poorly understood. We aimed to use MRI-derived measures of liver fat and genetics to investigate causal mechanisms that link higher liver fat to various health outcomes. We conducted a genome-wide association study on 37,358 UK Biobank participants to identify genetic variants associated with liver fat measured from MRI scans. We used a Mendelian randomization approach to investigate the causal effect of liver fat on health outcomes independent of BMI, alcohol consumption and lipids using data from published GWAS and FinnGen. We identified 13 genetic variants associated with liver fat that showed differing risks to health outcomes. Genetic variants associated with impaired hepatic triglyceride export showed liver fat-increasing alleles to be correlated with a reduced risk of coronary artery disease and myocardial infarction but an elevated risk of type 2 diabetes; and variants associated with enhanced de novo lipogenesis showed liver fat-increasing alleles to be linked to a higher risk of myocardial infarction and coronary artery disease. Genetically higher liver fat content increased the risk of non-alcoholic liver cirrhosis and of hepatocellular, intrahepatic bile duct and gallbladder cancers, exhibiting a dose-dependent relationship, irrespective of the mechanism. This study provides fresh insight into the heterogeneous effect of liver fat on health outcomes. It challenges the notion that liver fat per se is an independent risk factor for cardiovascular disease, underscoring the dependency of this association on the specific mechanisms that drive fat accumulation in the liver. However, excess liver fat, regardless of how achieved, appears to be causally linked to liver cirrhosis and cancers in a dose-dependent manner.
This research advances our understanding of the heterogeneity in mechanisms influencing liver fat accumulation, providing new insights into how liver fat accumulation may impact various health outcomes. The findings challenge the notion that liver fat is an independent risk factor for cardiovascular disease and highlight the mechanistic effect of some genetic variants on fat accumulation and the development of cardiovascular diseases. This study is of particular importance for healthcare professionals, including physicians and researchers, as well as patients, as it allows for more targeted and personalised treatment by understanding the relationship between liver fat and various health outcomes. The findings emphasise the need for a personalised management approach and a reshaping of risk assessment criteria. It also provides room for prioritising a clinical intervention aimed at reducing liver fat content (likely by intentional weight loss, however achieved) that could help protect against liver-related fibrosis and cancer.
Towards Eliminating Bias in Cluster Analysis of TB Genotyped Data
The relative contributions of transmission and reactivation of latent infection to TB cases observed clinically have been reported in many situations, but always with some uncertainty. Genotyped data from TB organisms obtained from patients have been used as the basis for heuristic distinctions between circulating (clustered strains) and reactivated infections (unclustered strains). Naïve methods previously applied to the analysis of such data are known to provide biased estimates of the proportion of unclustered cases. The hypergeometric distribution, which generates probabilities of observing clusters of a given size as realized clusters of all possible sizes, is analyzed in this paper to yield a formal estimator for genotype cluster sizes. Subtle aspects of numerical stability, bias, and variance are explored. This formal estimator is seen to be stable with respect to the epidemiologically interesting properties of the cluster size distribution (the number of clusters and the number of singletons) though it does not yield satisfactory estimates of the number of clusters of larger sizes. The problem that even complete coverage of genotyping, in a practical sampling frame, will only provide a partial view of the actual transmission network remains to be explored.
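The role of the hypergeometric distribution described above can be illustrated with a small sketch. It assumes the standard sampling model in which a fixed number of cases is genotyped uniformly at random without replacement from the full case population; this is an assumption made for illustration, the function name is hypothetical, and the code is not the estimator developed in the paper. It shows only the forward direction: how a true cluster of a given size is realized as smaller observed clusters, which is the source of the bias toward apparent singletons.

```python
from math import comb


def observed_size_pmf(true_size, population, sampled):
    """Distribution of how many members of one true cluster are sampled.

    Assumes `sampled` cases are drawn uniformly without replacement from
    `population` cases, of which `true_size` belong to the cluster.
    Returns a list p where p[k] is the probability of observing k members.
    """
    total = comb(population, sampled)
    pmf = []
    for k in range(true_size + 1):
        if k > sampled:
            pmf.append(0.0)  # cannot observe more members than were sampled
        else:
            ways = comb(true_size, k) * comb(population - true_size, sampled - k)
            pmf.append(ways / total)
    return pmf


# Toy example: a true cluster of 2 cases among 4, with 2 cases genotyped.
pmf = observed_size_pmf(true_size=2, population=4, sampled=2)
for k, p in enumerate(pmf):
    print(f"P(observe {k} of the cluster) = {p:.4f}")
```

In this toy setting the true pair is seen as an apparent singleton with probability 2/3 and as a pair with probability only 1/6, so counting observed singletons directly would overstate the share of unclustered cases.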
Genetic evidence for distinct biological mechanisms that link adiposity to type 2 diabetes: toward precision medicine
We aimed to unravel the mechanisms connecting adiposity to type 2 diabetes. We used MR-Clust to cluster independent genetic variants associated with body fat percentage (388 variants) and BMI (540 variants) based on their impact on type 2 diabetes. We identified five clusters of adiposity-increasing alleles associated with higher type 2 diabetes risk (unfavorable adiposity) and three clusters associated with lower risk (favorable adiposity). We then characterized each cluster based on various biomarkers, metabolites, and MRI-based measures of fat distribution and muscle quality. Analyzing the metabolic signatures of these clusters revealed two primary mechanisms connecting higher adiposity to reduced type 2 diabetes risk. The first involves higher adiposity in subcutaneous tissues (abdomen and thigh), lower liver fat, improved insulin sensitivity, and decreased risk of cardiometabolic diseases and diabetes complications. The second mechanism is characterized by increased body size and enhanced muscle quality, with no impact on cardiometabolic outcomes. Furthermore, our findings unveil diverse mechanisms linking higher adiposity to higher disease risk, such as cholesterol pathways or inflammation. These results reinforce the existence of adiposity-related mechanisms that may act as protective factors against type 2 diabetes and its complications, especially when accompanied by reduced ectopic liver fat.