15 research outputs found

    Manifold-valued data analysis of networks and shapes

    This thesis is concerned with the study of manifold-valued data analysis. Manifold-valued data is a type of multivariate data that lies on a manifold as opposed to a Euclidean space. We seek to develop analogues of classical multivariate analysis methods, which are appropriate for Euclidean data, for data that lie on particular manifolds. A manifold we particularly focus on is the manifold of graph Laplacians. Graph Laplacians can represent networks, and for the majority of this thesis we focus on the statistical analysis of samples of networks by identifying networks with their graph Laplacian matrices. We develop a general framework for extrinsic statistical analysis of samples of networks using this representation. For the graph Laplacians we define metrics, embeddings, tangent spaces, and a projection from Euclidean space to the space of graph Laplacians. This framework provides a way of computing means, performing principal component analysis and regression, carrying out hypothesis tests, such as tests for equality of means between two samples of networks, and classifying networks. We demonstrate these methods on many different network datasets, including networks derived from text and neuroimaging data. We also briefly consider another well-studied type of manifold-valued data, namely shape data, comparing three tangent coordinates commonly used in shape analysis and explaining the differences between them and why they may not always all be suitable.

    Modeling Holistic Marks With Analytic Rubrics

    Manifold valued data analysis of samples of networks, with applications in corpus linguistics

    Networks arise in many applications, such as in the analysis of text documents, social interactions and brain activity. We develop a general framework for extrinsic statistical analysis of samples of networks, motivated by networks representing text documents in corpus linguistics. We identify networks with their graph Laplacian matrices, for which we define metrics, embeddings, tangent spaces, and a projection from Euclidean space to the space of graph Laplacians. This framework provides a way of computing means, performing principal component analysis and regression, and carrying out hypothesis tests, such as tests for equality of means between two samples of networks. We apply the methodology to the set of novels by Jane Austen and Charles Dickens.

    Non‐parametric regression for networks

    Network data are becoming increasingly available, and so there is a need to develop suitable methodology for statistical analysis. Networks can be represented as graph Laplacian matrices, which are a type of manifold-valued data. Our main objective is to estimate a regression curve from a sample of graph Laplacian matrices conditional on a set of Euclidean covariates, for example in dynamic networks where the covariate is time. We develop an adapted Nadaraya-Watson estimator which has uniform weak consistency for estimation using Euclidean and power Euclidean metrics. We apply the methodology to the Enron email corpus to model smooth trends in monthly networks and highlight anomalous networks. Another motivating application is given in corpus linguistics, which explores trends in an author's writing style over time based on word co-occurrence networks.
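
As a rough illustration of the kernel-regression idea in this abstract, the sketch below computes a Nadaraya-Watson estimate under the plain Euclidean metric only, where the estimate is a kernel-weighted average of the observed graph Laplacians. The Gaussian kernel, the bandwidth value, the toy data and the function names `laplacian` and `nadaraya_watson` are all assumptions made for illustration; this is not the authors' adapted estimator, which also covers power Euclidean metrics.

```python
import numpy as np


def laplacian(adjacency):
    """Graph Laplacian L = D - A of a weighted, undirected adjacency matrix."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency


def nadaraya_watson(times, laplacians, t, bandwidth):
    """Kernel-weighted Euclidean mean of graph Laplacians at covariate value t."""
    times = np.asarray(times, dtype=float)
    stack = np.asarray(laplacians, dtype=float)  # shape (n, m, m)
    # Gaussian kernel weights (an assumed choice), normalised to sum to one.
    weights = np.exp(-0.5 * ((times - t) / bandwidth) ** 2)
    weights /= weights.sum()
    # Weighted sum over the sample axis gives an (m, m) matrix.
    return np.tensordot(weights, stack, axes=1)


# Toy example (hypothetical data): two-node networks whose single edge
# weight grows linearly with time.
times = [0.0, 1.0, 2.0]
laplacians = [laplacian([[0.0, w], [w, 0.0]]) for w in (1.0, 2.0, 3.0)]
estimate = nadaraya_watson(times, laplacians, t=1.0, bandwidth=0.5)
```

Because the weights are non-negative and sum to one, the Euclidean estimate is a convex combination of graph Laplacians and so keeps zero row sums and non-positive off-diagonal entries without a separate projection step.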

    Smoothing splines on Riemannian manifolds, with applications to 3D shape space

    © 2020 The Authors. Journal of the Royal Statistical Society: Series B (Statistical Methodology) published by John Wiley & Sons Ltd on behalf of Royal Statistical Society. There has been increasing interest in statistical analysis of data lying in manifolds. This paper generalizes a smoothing spline fitting method to Riemannian manifold data based on the technique of unrolling, unwrapping and wrapping originally proposed by Jupp and Kent for spherical data. In particular, we develop such a fitting procedure for shapes of configurations in general m-dimensional Euclidean space, extending our previous work for two-dimensional shapes. We show that parallel transport along a geodesic on Kendall shape space is linked to the solution of a homogeneous first-order differential equation, some of whose coefficients are implicitly defined functions. This finding enables us to approximate the procedure of unrolling and unwrapping by simultaneously solving such equations numerically, and so to find numerical solutions for smoothing splines fitted to higher dimensional shape data. This fitting method is applied to the analysis of some dynamic 3D peptide data.

    Association of age, hormonal, and lifestyle factors with the Leydig cell biomarker INSL3 in aging men from the European Male Aging Study cohort

    Background: Aging in men is accompanied by a broad range of symptoms, including sexual dysfunction, cognitive and musculoskeletal decline, obesity, type 2 diabetes, cardiovascular disease and hypertension, organ degeneration/failure, and increasing neoplasia, some of which are associated with declining levels of Leydig cell-produced testosterone. High natural biological variance, together with multiple factors that can modulate circulating testosterone concentration, may influence its interpretation and clinical implications. Insulin-like peptide 3 is a biomarker of Leydig cell function that might provide complementary information on testicular health and its downstream outcomes. Objectives: To characterize insulin-like peptide 3 as a biomarker to assess gonadal status in aging men. Methods and materials: A large European multicenter (European Male Aging Study) cohort of community-dwelling men was analyzed to determine how insulin-like peptide 3 relates to a range of hormonal, anthropometric, and lifestyle parameters. Results and discussion: Insulin-like peptide 3 declines cross-sectionally and longitudinally within individuals at approximately 15% per decade from age 40 years, unlike testosterone (1.9% per decade), which is partly compensated by increasing pituitary luteinizing hormone production. Importantly, lower insulin-like peptide 3 in younger men appears to persist with aging. Multiple regression analysis shows that, unlike testosterone, insulin-like peptide 3 is negatively dependent on luteinizing hormone and sex hormone-binding globulin and positively dependent on follicle-stimulating hormone, suggesting a different mechanism of gonadotropic regulation. Circulating insulin-like peptide 3 is negatively associated with increased body mass index or waist circumference and with smoking, and unlike testosterone, it is not affected by weight loss in obese individuals. Geographic variation in mean insulin-like peptide 3 within Europe appears to be largely explained by differences in these parameters. The results allowed the establishment of a European-wide reference range for insulin-like peptide 3 (95% confidence interval) adjusted for increasing age. Conclusion: Insulin-like peptide 3 is a constitutive biomarker of Leydig cell functional capacity and is a robust, reliably measurable peptide not subject to gonadotropin-dependent short-term regulation and within-individual variation in testosterone.

    The Leydig cell biomarker INSL3 as a predictor of age-related morbidity: Findings from the EMAS cohort

    Background: Insulin-like peptide 3 (INSL3) is a constitutive hormone secreted in men by the mature Leydig cells of the testes. It is an accurate biomarker for Leydig cell functional capacity, reflecting their total cell number and differentiation status. Objectives: To determine the ability of INSL3 to predict hypogonadism and age-related morbidity using the EMAS cohort of older community-dwelling men. Materials & methods: Circulating INSL3 was assessed in the EMAS cohort and its cross-sectional and longitudinal relationships to hypogonadism, here defined by testosterone (T) < 10.5 nmol/l, and a range of age-related morbidities determined by correlation and regression analysis. Results & discussion: While INSL3 is an accurate measure of primary hypogonadism, secondary and compensated hypogonadism also indicate reduced levels of INSL3, implying that testicular hypogonadism does not improve even when LH levels are increased, and that ageing-related hypogonadism may combine both primary and secondary features. Unadjusted, serum INSL3, like calculated free testosterone (cFT), LH, or the T/LH ratio, reflects hypogonadal status and is associated with reduced sexual function, bone mineral density, and physical activity, as well as increased occurrence of hypertension, cardiovascular disease, cancer, and diabetes. Using multiple regression analysis to adjust for a range of hormonal, anthropometric, and lifestyle factors, this relationship is lost for all morbidities, except for reduced bone mineral density, implying that INSL3 and/or its specific receptor, RXFP2, may be causally involved in promoting healthy bone metabolism. Elevated INSL3 also associates with hypertension and cardiovascular disease. When unadjusted, INSL3 in phase 1 of the EMAS study was assessed for its association with morbidity in phase 2 (mean 4.3 years later); INSL3 significantly predicts 7 out of 9 morbidity categories, behaving as well as cFT in this regard. In contrast, total T was predictive in only 3 of the 9 categories. Conclusion: Together with its low within-individual variance, these findings suggest that assessing INSL3 in men could offer important insight into the later development of disease in the elderly.

    Level of agreement between frequently used cardiovascular risk calculators in people living with HIV

    Objectives: The aim of the study was to describe agreement between the QRISK2, Framingham and Data Collection on Adverse Events of Anti‐HIV Drugs (D:A:D) cardiovascular disease (CVD) risk calculators in a large UK study of people living with HIV (PLWH). Methods: PLWH enrolled in the Pharmacokinetic and Clinical Observations in People over Fifty (POPPY) study without a prior CVD event were included in this study. QRISK2, Framingham CVD and the full and reduced D:A:D CVD scores were calculated; participants were stratified into ‘low’ (< 10%), ‘intermediate’ (10–20%) and ‘high’ (> 20%) categories for each. Agreement between scores was assessed using weighted kappas and Bland–Altman plots. Results: The 730 included participants were predominantly male (636; 87.1%) and of white ethnicity (645; 88.5%), with a median age of 53 [interquartile range (IQR) 49–59] years. The median calculated 10‐year CVD risk was 11.9% (IQR 6.8–18.4%), 8.9% (IQR 4.6–15.0%), 8.5% (IQR 4.8–14.6%) and 6.9% (IQR 4.1–11.1%) when using the Framingham, QRISK2, and full and reduced D:A:D scores, respectively. Agreement between the different scores was generally moderate, with the highest level of agreement being between the Framingham and QRISK2 scores (weighted kappa = 0.65) but with most other kappa coefficients in the 0.50–0.60 range. Conclusions: Estimates of predicted 10‐year CVD risk obtained with commonly used CVD risk prediction tools demonstrate, in general, only moderate agreement among PLWH in the UK. While further validation with clinical endpoints is required, our findings suggest that care should be taken when interpreting any score alone.
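
For readers unfamiliar with the agreement statistic named in this abstract, the following is a minimal sketch of a weighted Cohen's kappa for two sets of ordinal risk categories. The abstract does not say which weighting scheme was used, so the linear weights here (`power=1`) are an assumption, as are the function name `weighted_kappa` and the toy category labels.

```python
import numpy as np


def weighted_kappa(a, b, n_categories, power=1):
    """Weighted Cohen's kappa for two ratings coded as integers 0..n_categories-1.

    power=1 gives linear disagreement weights, power=2 quadratic weights.
    """
    observed = np.zeros((n_categories, n_categories))
    for i, j in zip(a, b):
        observed[i, j] += 1
    observed /= observed.sum()  # joint proportions
    # Expected joint proportions if the two ratings were independent.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    idx = np.arange(n_categories)
    disagreement = np.abs(idx[:, None] - idx[None, :]) ** power
    return 1.0 - (disagreement * observed).sum() / (disagreement * expected).sum()


# Toy example (hypothetical labels): 0 = low, 1 = intermediate, 2 = high.
score_a = [0, 1, 2, 1]
score_b = [0, 1, 2, 1]
kappa_identical = weighted_kappa(score_a, score_b, n_categories=3)
kappa_chance = weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], n_categories=2)
```

Identical ratings give a kappa of 1, and ratings that agree no more often than chance give a kappa of 0. Values between these, such as those reported in the abstract, indicate partial agreement.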