Balancing flexibility and robustness in machine learning: semi-parametric methods and sparse linear models
Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, November 201
Bayesian Field Theory: Nonparametric Approaches to Density Estimation, Regression, Classification, and Inverse Quantum Problems
Bayesian field theory denotes a nonparametric Bayesian approach for learning
functions from observational data. Based on the principles of Bayesian
statistics, a particular Bayesian field theory is defined by combining two
models: a likelihood model, providing a probabilistic description of the
measurement process, and a prior model, providing the information necessary to
generalize from training to non-training data. The particular likelihood models
discussed in the paper are those of general density estimation, Gaussian
regression, clustering, classification, and models specific for inverse quantum
problems. Besides hard constraints typical of the problem, such as normalization and
positivity for probabilities, prior models have to implement all the specific,
and often vague, "a priori" knowledge available for a given task.
Nonparametric prior models discussed in the paper are Gaussian processes,
mixtures of Gaussian processes, and non-quadratic potentials. Prior models are
made flexible by including hyperparameters. In particular, the adaptation of mean
functions and covariance operators of Gaussian process components is discussed
in detail. Even when constructed from Gaussian process building blocks, Bayesian
field theories are typically non-Gaussian and thus have to be solved
numerically. As computational resources increase, the class of numerically
feasible non-Gaussian Bayesian field theories of practical interest grows
steadily. Models that turn out to be computationally too demanding can serve
as a starting point for constructing easier-to-solve parametric approaches,
for example using variational techniques.
Comment: 200 pages, 99 figures, LaTeX; revised version
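As a minimal illustration of the Gaussian process building block mentioned above (only the tractable Gaussian-likelihood case, not the paper's non-Gaussian field theories), the posterior mean and variance of GP regression with a squared-exponential covariance can be computed directly; the kernel parameters, noise level, and sine-wave data below are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=0.1):
    """Posterior mean and pointwise variance of a GP with an RBF prior."""
    K = rbf_kernel(x_train, x_train) + noise**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)          # K^{-1} y
    mean = K_s.T @ alpha
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, np.diag(cov)

x = np.linspace(0, 2 * np.pi, 8)
y = np.sin(x)
xs = np.linspace(0, 2 * np.pi, 50)
mu, var = gp_posterior(x, y, xs)
```

For non-Gaussian likelihoods, this closed form is lost, which is exactly why the numerical and variational treatments described in the abstract become necessary.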
On variational approximations for frequentist and Bayesian inference
Variational approximations are approximate inference techniques for complex statistical models, providing fast, deterministic alternatives to conventional methods that, however accurate, take much longer to run. We extend recent work concerning variational approximations, developing and assessing some variational tools for likelihood-based and Bayesian inference. In particular, the first part of this thesis employs a Gaussian
variational approximation strategy to handle frequentist generalized linear mixed models with general design random effects matrices, such as those including spline basis functions. This method involves approximating the distributions of random effects vectors, conditional on the responses, via a Gaussian density. The second thread is concerned with a particular class of variational approximations, known as mean field variational Bayes, which is based upon a nonparametric product density restriction on
the approximating density. Algorithms for inference and fitting for models with elaborate responses and structures are developed, adopting the variational message passing perspective. The modularity of variational message passing is such that extensions to models with more involved likelihood structures and scalability to big datasets are relatively
simple. We also derive algorithms for models containing higher-level random effects and non-normal responses, which are streamlined in support of computational efficiency. Numerical studies and illustrations are provided, including comparisons with a Markov chain Monte Carlo benchmark.
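A toy sketch of the mean field idea described here, for the standard conjugate model of a Gaussian with unknown mean and precision (a textbook example, not the thesis' elaborate-response models): the factorisation q(mu, tau) = q(mu) q(tau) yields two coupled coordinate updates that are iterated to convergence. All prior hyperparameters below are illustrative defaults.

```python
import numpy as np

def cavi_normal(y, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, iters=50):
    """Mean field VB for y_i ~ N(mu, 1/tau) with a Normal-Gamma prior,
    using the factorisation q(mu, tau) = q(mu) q(tau).
    Returns the variational posterior mean of mu and E[tau]."""
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), y.mean()
    mu_n = (lam0 * mu0 + n * ybar) / (lam0 + n)   # mean of q(mu), fixed by conjugacy
    a_n = a0 + (n + 1) / 2.0                      # shape of q(tau), also fixed
    e_tau = a0 / b0                               # initial guess for E[tau]
    for _ in range(iters):
        lam_n = (lam0 + n) * e_tau                # precision of q(mu)
        e_mu2 = mu_n**2 + 1.0 / lam_n             # E[mu^2] under q(mu)
        b_n = b0 + 0.5 * (np.sum(y**2) - 2 * mu_n * y.sum() + n * e_mu2
                          + lam0 * (e_mu2 - 2 * mu0 * mu_n + mu0**2))
        e_tau = a_n / b_n                         # update for the q(tau) factor
    return mu_n, e_tau
```

Each update is deterministic, which is the source of the speed advantage over sampling-based methods noted in the abstract; the product restriction is what makes the optimisation tractable.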
Big Data Analytics and Information Science for Business and Biomedical Applications II
The analysis of big data in biomedical, business and financial research has drawn much attention from researchers worldwide. This collection of articles aims to provide a platform for an in-depth discussion of novel statistical methods developed for the analysis of big data in these areas. Both applied and theoretical contributions to these areas are showcased.
Flexible estimation of temporal point processes and graphs
Handling complex data types with spatial structures, temporal dependencies, or discrete values is generally a challenge in statistics and machine learning. In recent years, there has been an increasing need for methodological and theoretical work to analyse non-standard data types, for instance, data collected on protein structures, gene interactions, social networks or physical sensors. In this thesis, I will propose a methodology and provide theoretical guarantees for analysing two general types of discrete data emerging from interactive phenomena, namely temporal point processes and graphs.
On the one hand, temporal point processes are stochastic processes used to model event data, i.e., data that comes as discrete points in time or space where some phenomenon occurs. Some of the most successful applications of these discrete processes include online messages, financial transactions, earthquake strikes, and neuronal spikes. The popularity of these processes notably comes from their ability to model unobserved interactions and dependencies between temporally and spatially distant events. However, statistical methods for point processes generally rely on estimating a latent, unobserved, stochastic intensity process. In this context, designing flexible models and consistent estimation methods is often a challenging task.
On the other hand, graphs are structures made of nodes (or agents) and edges (or links), where an edge represents an interaction or relationship between two nodes. Graphs are ubiquitous in modelling real-world social, transport, and mobility networks, where edges can correspond to virtual exchanges, physical connections between places, or migrations across geographical areas. Besides, graphs are used to represent correlations and lead-lag relationships between time series, and local dependence between random objects. Graphs are typical examples of non-Euclidean data, for which adequate distance measures, similarity functions, and generative models need to be formalised. In the deep learning community, graphs have become particularly popular within the field of geometric deep learning.
Structure and dependence can both be modelled by temporal point processes and graphs, although predominantly, the former act on the temporal domain while the latter conceptualise spatial interactions. Nonetheless, some statistical models combine graphs and point processes in order to account for both spatial and temporal dependencies. For instance, temporal point processes have been used to model the birth times of edges and nodes in temporal graphs. Moreover, some multivariate point process models have a latent graph parameter governing the pairwise causal relationships between the components of
the process. In this thesis, I will notably study such a model, called the Hawkes model, as well as graphs evolving in time.
This thesis aims at designing inference methods that provide flexibility in the contexts of temporal point processes and graphs. This manuscript is presented in an integrated format, with four main chapters and two appendices. Chapters 2 and 3 are dedicated to the study of Bayesian nonparametric inference methods in the generalised Hawkes point process model. While Chapter 2 provides theoretical guarantees for existing methods, Chapter 3 also proposes, analyses, and evaluates a novel variational Bayes methodology. The other main chapters introduce and study model-free inference approaches for two estimation problems on graphs, namely spectral methods for the signed graph clustering problem in Chapter 4, and a deep learning algorithm for the network change point detection task on temporal graphs in Chapter 5.
Additionally, Chapter 1 provides an introduction and background preliminaries on point processes and graphs. Chapter 6 concludes this thesis with a summary and critical reflection on the work in this manuscript, and proposals for future research. Finally, the appendices contain two supplementary papers. The first one, in Appendix A, initiated after the COVID-19 outbreak in March 2020, is an application of a discrete-time Hawkes model to COVID-related death counts during the first wave of the pandemic. The second work, in Appendix B, was conducted during an internship at Amazon Research in 2021, and proposes an explainability method for anomaly detection models acting on multivariate time series.
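To make the Hawkes model studied above concrete, here is a minimal simulation of a univariate Hawkes process with an exponential excitation kernel via Ogata's thinning algorithm. This is a generic textbook sketch with arbitrary illustrative parameter values, not any estimator or method from the thesis:

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Ogata thinning for a univariate Hawkes process with conditional
    intensity lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta (t - t_i)).
    Between events the intensity decays, so its value just after the
    current time is a valid upper bound for the thinning step."""
    rng = np.random.default_rng(seed)

    def intensity(t, events):
        if not events:
            return mu
        return mu + alpha * np.sum(np.exp(-beta * (t - np.asarray(events))))

    t, events = 0.0, []
    while t < horizon:
        lam_bar = intensity(t, events)        # upper bound until the next event
        t += rng.exponential(1.0 / lam_bar)   # candidate inter-arrival time
        if t >= horizon:
            break
        # accept the candidate with probability lambda(t) / lam_bar
        if rng.uniform() <= intensity(t, events) / lam_bar:
            events.append(t)
    return np.array(events)

events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=200.0)
```

The self-excitation (each event raising the intensity of future events) is what produces the clustered event times that make the latent intensity, and hence the underlying causal graph in the multivariate case, nontrivial to estimate.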
Information Fusion of Magnetic Resonance Images and Mammographic Scans for Improved Diagnostic Management of Breast Cancer
Medical imaging is critical to non-invasive diagnosis and treatment of a wide spectrum
of medical conditions. However, different modalities of medical imaging employ
different contrast mechanisms and, consequently, provide different depictions of bodily
anatomy. As a result, there is a frequent problem where the same pathology can be
detected by one type of medical imaging while being missed by others. This problem brings
forward the importance of the development of image processing tools for integrating the
information provided by different imaging modalities via the process of information fusion.
One particularly important example of clinical application of such tools is in the diagnostic
management of breast cancer, which is a prevailing cause of cancer-related mortality in
women. Currently, the diagnosis of breast cancer relies mainly on X-ray mammography and
Magnetic Resonance Imaging (MRI), which are both important throughout different stages
of detection, localization, and treatment of the disease. The sensitivity of mammography,
however, is known to be limited in the case of relatively dense breasts, while contrast-enhanced
MRI tends to yield frequent 'false alarms' due to its high sensitivity. Given this
situation, it is critical to find reliable ways of fusing the mammography and MRI scans in
order to improve the sensitivity of the former while boosting the specificity of the latter.
Unfortunately, fusing the above types of medical images is known to be a difficult computational
problem. Indeed, while MRI scans are usually volumetric (i.e., 3-D), digital
mammograms are always planar (2-D). Moreover, mammograms are invariably acquired
under the force of compression paddles, thus making the breast anatomy undergo sizeable
deformations. In the case of MRI, on the other hand, the breast is rarely constrained and
is imaged in a pendulous state. Finally, X-ray mammography and MRI exploit two completely
different physical mechanisms, producing distinct diagnostic contrasts that
are related in a non-trivial way. Under such conditions, the success of information fusion
depends on one's ability to establish spatial correspondences between mammograms
and their related MRI volumes in a cross-modal cross-dimensional (CMCD) setting in the
presence of spatial deformations (+SD). Solving the problem of information fusion in the
CMCD+SD setting is a very challenging analytical/computational problem, still in need
of efficient solutions.
In the literature, there is a lack of a generic and consistent solution to the problem of
fusing mammograms and breast MRIs and using their complementary information. Most
of the existing MRI to mammogram registration techniques are based on a biomechanical
approach, which builds a specific model for each patient to simulate the effect of mammographic
compression. The biomechanical model is not optimal, as it ignores the common
characteristics of breast deformation across different cases. Breast deformation is essentially the planarization of a 3-D volume between two paddles, which is common to all
patients. Regardless of the size, shape, or internal configuration of the breast tissue, one
can predict the major part of the deformation by considering only the geometry of the
breast tissue. In contrast with complex standard methods relying on patient-specific biomechanical
modeling, we developed a new and relatively simple approach to estimate the
deformation and find the correspondences. We consider the total deformation to consist of
two components: a large-magnitude global deformation due to mammographic compression
and a residual deformation of relatively smaller amplitude. We propose a much simpler
way of predicting the global deformation which compares favorably to FEM in terms of
its accuracy. The residual deformation, on the other hand, is recovered in a variational
framework using an elastic transformation model.
The proposed algorithm provides us with a computational pipeline that takes breast
MRIs and mammograms as inputs and returns the spatial transformation which establishes
the correspondences between them. This spatial transformation can be applied in different
applications, e.g., producing 'MRI-enhanced' mammograms (which can improve
the quality of surgical care) and correlating different types of mammograms.
We investigate the performance of our proposed pipeline on the application of enhancing
mammograms by means of MRIs and show improvements over the state of the
art.
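The geometric intuition above, that mammographic compression is largely a planarization common to all patients, can be caricatured with a toy volume-preserving squeeze of a 3-D point cloud. This is a hypothetical illustration of the idea only, not the thesis' actual global-deformation model or its variational residual-deformation step:

```python
import numpy as np

def paddle_compression(points, thickness):
    """Toy global-deformation model: scale a 3-D point cloud down along z
    to the paddle separation `thickness`, and spread it laterally so the
    overall volume scale factor stays 1 (squeeze * stretch**2 == 1)."""
    pts = np.asarray(points, dtype=float)
    z = pts[:, 2]
    squeeze = thickness / (z.max() - z.min())   # flatten along the compression axis
    stretch = 1.0 / np.sqrt(squeeze)            # compensating lateral spread
    out = pts.copy()
    out[:, :2] *= stretch                       # spread in the paddle plane
    out[:, 2] = (z - z.mean()) * squeeze        # squeeze toward the mid-plane
    return out
```

In the thesis' framing, a crude geometric prediction of this kind supplies the large-magnitude global component, and only the smaller residual deformation is left to an elastic registration step.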
Nonparametric Bayesian analysis of some clustering problems
Nonparametric Bayesian models have been researched extensively in the past 10 years
following the work of Escobar and West (1995) on sampling schemes for Dirichlet processes.
The infinite mixture representation of the Dirichlet process makes it useful
for clustering problems where the number of clusters is unknown. We develop nonparametric
Bayesian models for two different clustering problems, namely functional
and graphical clustering.
We propose a nonparametric Bayes wavelet model for clustering of functional or
longitudinal data. The wavelet modelling is aimed at the resolution of global and
local features during clustering. The model also allows the elicitation of prior belief
about the regularity of the functions and has the ability to adapt to a wide range
of functional regularity. Posterior inference is carried out by Gibbs sampling with
conjugate priors for fast computation. We use simulated as well as real datasets to
illustrate the suitability of the approach over other alternatives.
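The separation of global and local features that wavelets provide can be seen in a one-level Haar transform, sketched below with plain NumPy. This illustrates the general idea only; it is not the thesis' wavelet model, prior, or Gibbs sampler:

```python
import numpy as np

def haar_dwt(signal):
    """One level of the orthonormal Haar transform: pairwise averages
    capture the global shape (approximation coefficients), pairwise
    differences capture the local detail (detail coefficients)."""
    s = np.asarray(signal, dtype=float)
    approx = (s[0::2] + s[1::2]) / np.sqrt(2.0)
    detail = (s[0::2] - s[1::2]) / np.sqrt(2.0)
    return approx, detail

# A smooth curve and a curve with one local spike share global structure
# but differ sharply in their detail coefficients.
t = np.linspace(0, 1, 64)
smooth = np.sin(2 * np.pi * t)
spiky = smooth.copy()
spiky[30] += 3.0
a1, d1 = haar_dwt(smooth)
a2, d2 = haar_dwt(spiky)
```

Clustering curves on such coefficients, rather than on raw observations, is what lets a functional clustering model resolve global and local features at the same time.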
The functional clustering model is extended to analyze splice microarray data.
New microarray technologies probe consecutive segments along genes to observe alternative
splicing (AS) mechanisms that produce multiple proteins from a single gene.
Clues regarding the number of splice forms can be obtained by clustering the functional
expression profiles from different tissues. The analysis was carried out on the Rosetta dataset (Johnson et al., 2003) to obtain a splice-variant-by-tissue distribution
for all 10,000 genes. We were able to identify a number of splice forms that appear
to be unique to cancer.
We propose a Bayesian model for partitioning graphs depicting dependencies
in a collection of objects. After suitable transformations and modelling techniques,
the problem of graph cutting can be approached by nonparametric Bayes clustering.
We draw motivation from recent work (Dhillon, 2001) showing the equivalence of
kernel k-means clustering and certain graph cutting algorithms. It is shown that
loss functions similar to that of kernel k-means naturally arise in this model, and that
minimization of the associated posterior risk constitutes an effective graph cutting strategy.
We present here results from the analysis of two microarray datasets, namely the
melanoma dataset (Bittner et al., 2000) and the sarcoma dataset (Nykter et al.,
2006).
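The "unknown number of clusters" property of the Dirichlet process noted above can be illustrated by its induced prior over partitions, the Chinese restaurant process. This is a generic sketch of a single prior draw, not the conjugate Gibbs samplers developed in the thesis:

```python
import numpy as np

def crp_partition(n, alpha, seed=0):
    """Draw a partition of n items from the Chinese restaurant process:
    item i joins an existing cluster k with probability n_k / (i + alpha)
    and opens a new cluster with probability alpha / (i + alpha), so the
    number of clusters is random and grows roughly as alpha * log(n)."""
    rng = np.random.default_rng(seed)
    labels = [0]          # the first item always starts cluster 0
    counts = [1]
    for i in range(1, n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()                    # normalise over i + alpha
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):                    # a new cluster was opened
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return np.array(labels)
```

Because new clusters remain possible at every step, posterior inference under this prior never has to fix the number of clusters in advance, which is precisely the property exploited in both the functional and graphical clustering models.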