
    State-Space Inference and Learning with Gaussian Processes

    State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation-maximization (EM) algorithm to iterate between inference in the latent state space and learning the parameters of the underlying GP dynamics model. Copyright 2010 by the authors.

    The pseudotemporal bootstrap for predicting glaucoma from cross-sectional visual field data

    Progressive loss of the field of vision is characteristic of a number of eye diseases such as glaucoma, a leading cause of irreversible blindness in the world. Recently, there has been an explosion in the amount of data being stored on patients who suffer from visual deterioration, including visual field (VF) tests, retinal images, and frequent intraocular pressure measurements. Like the progression of many biological and medical processes, VF progression is inherently temporal in nature. However, many datasets associated with the study of such processes are cross-sectional, and the time dimension is not measured because longitudinal studies are expensive. In this paper, we address this issue by developing a method to build artificial time series, which we call pseudo time series, from cross-sectional data. This involves building trajectories through all of the data that can then, in turn, be used to build temporal models for forecasting (which would be impossible without longitudinal data). Glaucoma, like many diseases, is a family of conditions, and it is therefore likely that there will be a number of key trajectories that are important in understanding the disease. To deal with such situations, we extend the idea of pseudo time series by using resampling techniques to build multiple sequences prior to model building. This approach naturally handles outliers and multiple possible disease trajectories. We demonstrate some key properties of our approach on synthetic data and present very promising results on VF data for predicting glaucoma.
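The resampling idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how trajectories are ordered, so the greedy nearest-neighbour walk below is a stand-in assumption, and the function names, the `start` index (a hypothetical "healthy" reference sample), and all parameters are illustrative rather than taken from the paper.

```python
import numpy as np

def pseudo_time_series(X, start, sample_size, rng):
    """Build one pseudo time series from cross-sectional rows of X.

    Assumption: ordering is a greedy nearest-neighbour walk from `start`;
    the paper may order the resampled points differently.
    """
    # Resample a subset of the cross-sectional data (without replacement).
    sampled = rng.choice(len(X), size=sample_size, replace=False)
    remaining = [int(i) for i in sampled if i != start]
    path = [int(start)]
    # Repeatedly step to the closest not-yet-visited sample.
    while remaining:
        dists = np.linalg.norm(X[remaining] - X[path[-1]], axis=1)
        path.append(remaining.pop(int(np.argmin(dists))))
    return path

def pseudotemporal_bootstrap(X, start, n_series=10, sample_size=20, seed=0):
    """Repeat the resample-and-order step to obtain multiple sequences."""
    rng = np.random.default_rng(seed)
    return [pseudo_time_series(X, start, sample_size, rng)
            for _ in range(n_series)]
```

Each returned list is a sequence of row indices into `X`; a temporal model could then be fit to the rows in that order. Because every sequence is built from a different resample, a single outlier appears in only some of them.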

    Genetic optimization of training sets for improved machine learning models of molecular properties

    The training of molecular models of quantum mechanical properties based on statistical machine learning requires large datasets which exemplify the map from chemical structure to molecular property. Intelligent a priori selection of training examples is often difficult or impossible to achieve, as prior knowledge may be sparse or unavailable. Ordinarily, representative selection of training molecules from such datasets is achieved through random sampling. We use genetic algorithms for the optimization of training set composition consisting of tens of thousands of small organic molecules. The resulting machine learning models are considerably more accurate with respect to small randomly selected training sets: mean absolute errors for out-of-sample predictions are reduced to ~25% for enthalpies, free energies, and zero-point vibrational energy, to ~50% for heat-capacity, electron-spread, and polarizability, and by more than ~20% for electronic properties such as frontier orbital eigenvalues or dipole-moments. We discuss and present optimized training sets consisting of 10 molecular classes for all molecular properties studied. We show that these classes can be used to design improved training sets for the generation of machine learning models of the same properties in similar but unrelated molecular sets.
    Comment: 9 pages, 6 figures
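As a rough illustration of optimizing training set composition with a genetic algorithm, the sketch below evolves fixed-size index subsets and scores each by the validation mean absolute error of a model trained on it. Everything specific here is an assumption: the abstract does not give the model, the genetic operators, or any parameter values, so the Gaussian-kernel ridge regression, the crossover and mutation steps, and all names are hypothetical.

```python
import numpy as np

def krr_mae(X, y, train_idx, test_idx, sigma=1.0, lam=1e-3):
    """MAE on test_idx of a Gaussian-kernel ridge model fit on train_idx."""
    def kernel(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    Xt, Xs = X[train_idx], X[test_idx]
    alpha = np.linalg.solve(kernel(Xt, Xt) + lam * np.eye(len(Xt)),
                            y[train_idx])
    return float(np.abs(kernel(Xs, Xt) @ alpha - y[test_idx]).mean())

def evolve_training_set(X, y, pool_idx, val_idx, set_size=20, pop_size=20,
                        generations=30, mutation_rate=0.1, seed=0):
    """Evolve a training subset of pool_idx that minimizes validation MAE."""
    rng = np.random.default_rng(seed)
    pool = np.asarray(pool_idx)
    population = [rng.choice(pool, size=set_size, replace=False)
                  for _ in range(pop_size)]
    history = []
    for gen in range(generations + 1):
        scores = np.array([krr_mae(X, y, ind, val_idx) for ind in population])
        order = np.argsort(scores)
        best = population[order[0]]
        history.append(float(scores[order[0]]))
        if gen == generations:
            break
        # Selection: keep the better half as parents.
        parents = [population[i] for i in order[:pop_size // 2]]
        children = [parents[0].copy()]  # elitism: carry the best over unchanged
        while len(children) < pop_size:
            a, b = rng.choice(len(parents), size=2, replace=False)
            # Crossover: draw the child from the union of both parents.
            genes = np.union1d(parents[a], parents[b])
            child = rng.choice(genes, size=set_size, replace=False)
            # Mutation: swap some members for pool entries not yet in the child.
            for k in range(set_size):
                if rng.random() < mutation_rate:
                    child[k] = rng.choice(np.setdiff1d(pool, child))
            children.append(child)
        population = children
    return best, history
```

The random-sampling baseline mentioned in the abstract corresponds to the first generation, so `history[0]` versus `history[-1]` gives a quick comparison on toy data. Keeping the best individual unchanged ensures the best validation error never gets worse between generations.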