Learning from High-Dimensional Multivariate Signals.
Modern measurement systems monitor a growing number of variables at low cost. In the problem
of characterizing the observed measurements, budget limitations usually constrain the number n of samples that one can acquire, leading to situations where the number p of variables is much larger than n. In this situation, classical statistical methods, founded on the assumption that n is large and p is fixed,
fail both in theory and in practice. A successful approach to overcome this problem is to assume a parsimonious generative model characterized by a number k of
parameters, where k is much smaller than p.
In this dissertation we develop algorithms to fit low-dimensional generative models
and extract relevant information from high-dimensional, multivariate signals. First,
we define extensions of the well-known Scalar Shrinkage-Thresholding Operator, which
we name Multidimensional and Generalized Shrinkage-Thresholding Operators, and
show that these extensions arise in numerous algorithms for structured-sparse linear and non-linear regression. Using convex optimization techniques, we show that
these operators, defined as the solutions to a class of convex, non-differentiable optimization problems, have an equivalent convex, low-dimensional reformulation. Our
equivalence results shed light on the behavior of a general class of penalties that includes classical sparsity-inducing penalties such as the LASSO and the Group LASSO.
In addition, our reformulation leads in some cases to new efficient algorithms for a
variety of high-dimensional penalized estimation problems.
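For orientation, the classical scalar operator that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the standard soft-thresholding rule (the proximal operator of the LASSO penalty) and of its usual block counterpart for the Group LASSO. It does not reproduce the Multidimensional or Generalized operators defined in the dissertation, and the function names `soft_threshold` and `group_soft_threshold` are hypothetical, chosen only for this example.

```python
import math


def soft_threshold(x: float, lam: float) -> float:
    """Classical scalar shrinkage-thresholding.

    Returns the minimizer over z of 0.5 * (z - x)**2 + lam * |z|, for lam >= 0.
    """
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def group_soft_threshold(x: list[float], lam: float) -> list[float]:
    """Standard block (Group LASSO) shrinkage, shown for comparison only.

    Returns the minimizer over z of 0.5 * ||z - x||_2**2 + lam * ||z||_2.
    The whole vector is set to zero when its Euclidean norm is at most lam;
    otherwise it is shrunk towards the origin along its own direction.
    """
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= lam:
        return [0.0] * len(x)
    scale = 1.0 - lam / norm
    return [scale * v for v in x]
```

For example, `soft_threshold(3.0, 1.0)` returns `2.0`, while any input with magnitude at most `lam` is mapped to `0.0`; this is the mechanism by which such penalties produce sparse estimates.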
Second, we introduce two new classes of low-dimensional factor models that account for temporal shifts commonly occurring in multivariate signals. Our first contribution, called Order Preserving Factor Analysis, can be seen as an extension of the
non-negative, sparse matrix factorization model to allow for order-preserving temporal translations in the data. We develop an efficient descent algorithm to fit this model
using techniques from convex and non-convex optimization. Our second contribution
extends Principal Component Analysis to the analysis of observations suffering from
circular shifts, and we call it Misaligned Principal Component Analysis. We
quantify the effect of the misalignments on the spectrum of the sample covariance matrix in the high-dimensional regime and develop simple algorithms to jointly estimate
the principal components and the misalignment parameters.
Ph.D.
Electrical Engineering: Systems
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/91544/1/atibaup_1.pd
DNA methylation as a biomarker for age-related cognitive impairment
PhD Thesis
Due to the ageing population, the number of patients diagnosed with age-related
diseases such as stroke and Parkinson’s disease is on the rise. In both post-stroke
dementia (PSD) and mild cognitive impairment in Parkinson’s disease (PD-MCI), the
mechanisms resulting in cognitive decline are unknown. This project aims to identify a
biomarker which could predict those patients most at risk of developing cognitive
decline, which would subsequently assist healthcare professionals in recommending
early treatment and care.
Epigenetics is an emerging field in which biomarkers have previously been useful in
prognostication of cancers and prediction of cardiovascular disease. In this study, 30
patients from a PSD cohort (COGFAST) and 48 patients from a PD-MCI cohort
(ICICLE) were analysed using the Illumina HumanMethylation450 BeadChip to
identify differentially methylated positions which could predict patients who would
later develop cognitive decline. Top hits were validated using Pyrosequencing to
confirm DNA methylation differences in a replication cohort.
Individual CpG sites within APOB and NGF were identified as potential blood-based
biomarkers for PSD and one CpG site within CHCHD5 was highlighted as a potential
blood-based biomarker for PD-MCI. In addition, methylation at one CpG site within
NGF and a CpG site (cg18837178) within a non-coding RNA were found to be
associated with Braak staging (degree of brain pathology) using DNA from two brain
regions. NGF deregulation has previously been associated with Alzheimer’s disease,
and this finding indicates it may also have a role in the development of PSD.
These novel findings represent the first steps towards the identification of blood-based
biomarkers to assist with diagnosis of PSD and PD-MCI, but require further validation
in a larger independent cohort. The differentially methylated genes identified may also
give insight into some of the mechanisms involved in these complex diseases,
potentially leading to the future development of targeted preventative treatments.
Medical Research Council and Newcastle Universit