32 research outputs found

PROBABILISTIC AND GEOMETRIC APPROACHES TO THE ANALYSIS OF NON-STANDARD DATA

Author: Carmichael Iain
Publication venue: University of North Carolina at Chapel Hill Graduate School
Publication date: 01/01/2019

This dissertation explores topics in machine learning, network analysis, and the foundations of statistics using tools from geometry, probability and optimization. The rise of machine learning has brought powerful new (and old) algorithms for data analysis. Much of classical statistics research is about understanding how statistical algorithms behave depending on various aspects of the data. The first part of this dissertation examines the support vector machine classifier (SVM). Leveraging Karush-Kuhn-Tucker conditions we find surprising connections between SVM and several other simple classifiers. We use these connections to explain SVM’s behavior in a variety of data scenarios and demonstrate how these insights are directly relevant to the data analyst. The next part of this dissertation studies networks which evolve over time. We first develop a method to empirically evaluate vertex centrality metrics in an evolving network. We then apply this methodology to investigate the role of precedent in the US legal system. Next, we shift to a probabilistic perspective on temporally evolving networks. We study a general probabilistic model of an evolving network that undergoes an abrupt change in its evolution dynamics. In particular, we examine the effect of such a change on the network’s structural properties. We develop mathematical techniques using continuous time branching processes to derive quantitative error bounds for functionals of a major class of these models about their large network limits. Using these results, we develop general theory to understand the role of abrupt changes in the evolution dynamics of these models. Based on this theory we derive a consistent, non-parametric change point detection estimator. We conclude with a discussion on foundational topics in statistics, commenting on debates both old and new. 
First, we examine the false confidence theorem, which raises questions for data practitioners making inferences based on epistemic uncertainty measures such as Bayesian posterior distributions. Second, we give an overview of the rise of “data science” and what it means for statistics (and vice versa), touching on topics such as reproducibility, computation, education, communication and statistical theory.

Doctor of Philosophy

Carolina Digital Repository

Education, skills and productivity: commissioned research; first joint special report of the Business, Innovation and Skills and Education Committees of session 2015-16; second special report of the Business, Innovation and Skills Committee of session 2015-16; third special report of the Education Committee of session 2015-16

Author: Carmichael Neil
Wright Iain
Publication venue: The Stationery Office
Publication date: 01/01/2015

Digital Education Resource Archive

Examining the Evolution of Legal Precedent Through Citation Network Analysis

Author: Carmichael Iain
Jushchuk James
Kim Michael
Wudel James
Publication venue: Carolina Law Scholarship Repository
Publication date: 01/12/2017

bepress Legal Repository

University of North Carolina School of Law

Joint and individual analysis of breast cancer histologic images and genomic covariates

Author: Calhoun Benjamin C.
Carmichael Iain
Couture Heather D.
Geradts Joseph
Hannig Jan
Hoadley Katherine A.
Marron J. S.
Niethammer Marc
Olsson Linnea
Perou Charles M.
Troester Melissa A.
Publication date: 13/04/2020

A key challenge in modern data analysis is understanding connections between complex and differing modalities of data. For example, two of the main approaches to the study of breast cancer are histopathology (analyzing visual characteristics of tumors) and genetics. While histopathology is the gold standard for diagnostics and there have been many recent breakthroughs in genetics, there is little overlap between these two fields. We aim to bridge this gap by developing methods based on Angle-based Joint and Individual Variation Explained (AJIVE) to directly explore similarities and differences between these two modalities. Our approach exploits Convolutional Neural Networks (CNNs) as a powerful, automatic method for image feature extraction to address some of the challenges presented by statistical analysis of histopathology image data. CNNs raise issues of interpretability that we address by developing novel methods to explore visual modes of variation captured by statistical algorithms (e.g. PCA or AJIVE) applied to CNN features. Our results provide many interpretable connections and contrasts between histopathology and genetics.

arXiv.org e-Print Archive

PubMed Central

Carolina Digital Repository

ScholarShip