Sequential Gaussian Processes for Online Learning of Nonstationary Functions
Many machine learning problems can be framed in the context of estimating
functions, and often these are time-dependent functions that are estimated in
real-time as observations arrive. Gaussian processes (GPs) are an attractive
choice for modeling real-valued nonlinear functions due to their flexibility
and uncertainty quantification. However, the typical GP regression model
suffers from several drawbacks: i) conventional GP inference scales as O(N^3)
with respect to the number of observations; ii) updating a GP model
sequentially is not trivial; and iii) covariance kernels often enforce
stationarity constraints on the function, while GPs with non-stationary
covariance kernels are often intractable to use in practice. To overcome these
issues, we propose an online sequential Monte Carlo algorithm to fit mixtures
of GPs that capture non-stationary behavior while allowing for fast,
distributed inference. By formulating hyperparameter optimization as a
multi-armed bandit problem, we accelerate mixing for real-time inference. Our
approach empirically improves performance over state-of-the-art methods for
online GP estimation in the context of prediction for simulated non-stationary
data and hospital time series data.
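The first drawback named in the abstract above can be made concrete with a minimal sketch of exact GP regression. This is not the sequential Monte Carlo mixture method the paper proposes; it is the conventional baseline, assuming a stationary squared-exponential kernel. The function names (`rbf_kernel`, `gp_posterior`), the hyperparameter values, and the toy sine data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential (stationary) kernel between two 1-D input arrays."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Exact GP posterior mean and latent variance at x_test (hypothetical helper)."""
    n = len(x_train)
    K = rbf_kernel(x_train, x_train) + noise * np.eye(n)
    # The Cholesky factorization of the N x N kernel matrix is the cubic-cost step.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_s = rbf_kernel(x_train, x_test)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    return mean, var

# Toy data (assumed for illustration): noisy observations of sin(x).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
y = np.sin(x) + 0.1 * rng.standard_normal(30)
mean, var = gp_posterior(x, y, np.array([2.5]))
```

Adding one observation changes the whole kernel matrix, so a naive update refactorizes it from scratch, which is one way to see why sequential updating is described as non-trivial.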
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Efficient high-dimensional importance sampling in mixture frameworks
This paper provides high-dimensional and flexible importance sampling procedures for the likelihood evaluation of dynamic latent variable models involving finite or infinite mixtures leading to possibly heavy-tailed and/or multi-modal target densities. Our approach is based upon the efficient importance sampling (EIS) approach of Richard and Zhang (2007) and exploits the mixture structure of the model when constructing importance sampling distributions as mixtures of distributions. The proposed mixture EIS procedures are illustrated with ML estimation of a Student-t state space model for realized volatilities and a stochastic volatility model with leverage effects and jumps for asset returns.
Keywords: dynamic latent variable model, importance sampling, marginalized likelihood, mixture, Monte Carlo, realized volatility, stochastic volatility
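The idea of using a mixture as the importance sampling distribution for a multi-modal target can be illustrated with a small sketch. This is plain importance sampling with a fixed, hand-chosen Gaussian mixture proposal, not the EIS procedure of Richard and Zhang (2007), which fits the proposal to the target. The toy target, the proposal parameters, and all function names are assumptions made for illustration.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def unnormalized_target(x):
    """Toy bimodal target (assumed); its exact integral is 2 * sqrt(2 * pi)."""
    return np.exp(-0.5 * (x - 3.0) ** 2) + np.exp(-0.5 * (x + 3.0) ** 2)

def mixture_proposal_sample(rng, n, means, sds, probs):
    """Draw n points: pick a component per draw, then sample from it."""
    comp = rng.choice(len(means), size=n, p=probs)
    return rng.normal(means[comp], sds[comp])

def mixture_proposal_pdf(x, means, sds, probs):
    """Density of the Gaussian mixture proposal."""
    return sum(p * normal_pdf(x, m, s) for p, m, s in zip(probs, means, sds))

rng = np.random.default_rng(0)
# One proposal component per target mode, wider than the mode to keep weights bounded.
means = np.array([-3.0, 3.0])
sds = np.array([1.5, 1.5])
probs = np.array([0.5, 0.5])

x = mixture_proposal_sample(rng, 20000, means, sds, probs)
weights = unnormalized_target(x) / mixture_proposal_pdf(x, means, sds, probs)

z_hat = weights.mean()  # estimate of the target's integral
ess = weights.sum() ** 2 / (weights ** 2).sum()  # effective sample size
```

A single Gaussian proposal centered between the modes would need a much larger variance to cover both, which tends to lower the effective sample size; placing one component per mode is the design choice the sketch is meant to show.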
- …