State-Space Inference and Learning with Gaussian Processes
State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model. Copyright 2010 by the authors.
The pseudotemporal bootstrap for predicting glaucoma from cross-sectional visual field data
Progressive loss of the field of vision is characteristic of a number of eye diseases such as glaucoma, a leading cause of irreversible blindness in the world. Recently, there has been an explosion in the amount of data being stored on patients who suffer from visual deterioration, including visual field (VF) tests, retinal images, and frequent intraocular pressure measurements. Like the progression of many biological and medical processes, VF progression is inherently temporal in nature. However, many datasets associated with the study of such processes are cross-sectional, and the time dimension is not measured due to the expensive nature of such studies. In this paper, we address this issue by developing a method to build artificial time series, which we call pseudo time series, from cross-sectional data. This involves building trajectories through all of the data that can then, in turn, be used to build temporal models for forecasting (which would be impossible without longitudinal data). Glaucoma, like many diseases, is a family of conditions and it is, therefore, likely that there will be a number of key trajectories that are important in understanding the disease. In order to deal with such situations, we extend the idea of pseudo time series by using resampling techniques to build multiple sequences prior to model building. This approach naturally handles outliers and multiple possible disease trajectories. We demonstrate some key properties of our approach on synthetic data and present very promising results on VF data for predicting glaucoma.
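The resampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: a greedy nearest-neighbour ordering stands in for whatever trajectory-building step the paper uses, and the synthetic data, the `start` point, and the function name `pseudo_time_series` are all assumptions made for illustration.

```python
import numpy as np

def pseudo_time_series(points, start, sample_size, rng):
    """Bootstrap-resample the cross-sectional points, then order the sample
    greedily by proximity (an assumed stand-in for trajectory building)."""
    idx = rng.choice(len(points), size=sample_size, replace=True)
    sample = points[idx]
    # Begin at the sampled point closest to the assumed "early stage" start.
    current = int(np.argmin(np.linalg.norm(sample - start, axis=1)))
    remaining = [i for i in range(sample_size) if i != current]
    order = [current]
    while remaining:
        # Step to the nearest sampled point that has not been visited yet.
        dists = np.linalg.norm(sample[remaining] - sample[current], axis=1)
        current = remaining.pop(int(np.argmin(dists)))
        order.append(current)
    return sample[order]

# Synthetic cross-sectional data: noisy points along a curve of "severity".
rng = np.random.default_rng(0)
severity = rng.uniform(0.0, 1.0, size=200)
points = np.column_stack([severity, severity ** 2])
points = points + 0.05 * rng.normal(size=(200, 2))
start = np.array([0.0, 0.0])  # hypothetical healthy reference point

# Multiple resampled sequences, built before any temporal model is fitted.
trajectories = [pseudo_time_series(points, start, 30, rng) for _ in range(10)]
```

Each element of `trajectories` is one ordered sequence of resampled points. A temporal model could then be fitted to the collection, so that no single ordering, and no single outlier, dominates the result.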
Genetic optimization of training sets for improved machine learning models of molecular properties
The training of molecular models of quantum mechanical properties based on statistical machine learning requires large datasets which exemplify the map from chemical structure to molecular property. Intelligent a priori selection of training examples is often difficult or impossible to achieve, as prior knowledge may be sparse or unavailable. Ordinarily, representative selection of training molecules from such datasets is achieved through random sampling. We use genetic algorithms for the optimization of training set composition consisting of tens of thousands of small organic molecules. The resulting machine learning models are considerably more accurate than models trained on small randomly selected training sets: mean absolute errors for out-of-sample predictions are reduced to ~25% for enthalpies, free energies, and zero-point vibrational energy, to ~50% for heat-capacity, electron-spread, and polarizability, and by more than ~20% for electronic properties such as frontier orbital eigenvalues or dipole-moments. We discuss and present optimized training sets consisting of 10 molecular classes for all molecular properties studied. We show that these classes can be used to design improved training sets for the generation of machine learning models of the same properties in similar but unrelated molecular sets.
Comment: 9 pages, 6 figures
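As a rough illustration of genetic optimization of training set composition, the sketch below evolves small index subsets of a candidate pool so that a model trained on each subset has a low mean absolute error on held-out data. Everything here is an assumption made for illustration: the synthetic one-dimensional data, the Gaussian kernel ridge model, and the crossover, mutation, and elitism choices are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a molecular dataset: features X, target property y.
X = rng.uniform(-3.0, 3.0, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)
pool = np.arange(200)            # indices available for training
X_val, y_val = X[200:], y[200:]  # held-out points used to score a subset

def kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def mae_for_subset(subset, ridge=1e-3):
    """Fit kernel ridge regression on `subset`; return the validation MAE."""
    Xs, ys = X[subset], y[subset]
    alpha = np.linalg.solve(kernel(Xs, Xs) + ridge * np.eye(len(subset)), ys)
    return float(np.mean(np.abs(kernel(X_val, Xs) @ alpha - y_val)))

def crossover(a, b, k):
    """Child draws k distinct indices from the union of both parents."""
    return rng.choice(np.union1d(a, b), size=k, replace=False)

def mutate(ind, rate=0.2):
    """Swap some indices for pool members not already in the individual."""
    ind = ind.copy()
    for i in range(len(ind)):
        if rng.random() < rate:
            ind[i] = rng.choice(np.setdiff1d(pool, ind))
    return ind

def evolve(k=10, pop_size=20, generations=15):
    population = [rng.choice(pool, size=k, replace=False) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=mae_for_subset)
        history.append(mae_for_subset(population[0]))
        parents = population[: pop_size // 2]  # elitism: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            children.append(mutate(crossover(parents[i], parents[j], k)))
        population = parents + children
    best = min(population, key=mae_for_subset)
    return best, history + [mae_for_subset(best)]

best_subset, history = evolve()
```

Because the better half of each generation is carried over unchanged, the best error recorded in `history` cannot get worse from one generation to the next. Comparing `history[-1]` against the error of a random subset of the same size gives a quick check on whether the search helps for a given dataset.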