Sparse Bayesian kernel learning for high-dimensional regression and classification

Author: Duan Weikang

Doctor of Philosophy. Department of Statistics. Gyuhyeong Goh.

In recent decades, statistical learning has become an increasingly popular topic that has drawn significant attention from researchers. Kernel-based nonlinear models, in particular, are powerful tools because of their flexibility in extracting information from complex datasets. A major challenge for kernel modeling in the current big data era is the curse of dimensionality. Although many variable selection methods have been proposed, the development of high-dimensional Bayesian kernel models is still in its infancy. Beyond variable selection, kernel-based models are inherently expensive to compute, which further limits the application of related methods. The goal of this dissertation is to develop new, fast variable selection and prediction procedures that address high-dimensional nonlinear regression and classification from the Bayesian perspective. To reduce the computational cost, we propose a novel hybrid search algorithm and Bayesian doubly-sparse frameworks for kernel-based models. In Chapter 1, we discuss the background, existing methods, and their limitations, and give the motivation for our study. In Chapter 2, we propose a Bayesian model hybrid search algorithm for Gaussian process (GP) regression models, which quickly scans the model space for a set of models with high posterior probabilities. In addition, we address the massive, high-dimensional data problem for GP by proposing an approach that combines a quantile subsample hybrid search with a nearest neighbor GP scheme. In Chapter 3, we propose a novel Bayesian doubly-sparse framework for reproducing kernel Hilbert space (RKHS) regression models. The proposed doubly-sparse framework performs both variable selection and sparse kernel matrix estimation.
In Chapter 4, we extend our proposed Bayesian doubly-sparse framework to the nonlinear Bayesian support vector machine

K-State Research Exchange

Using neutral cline decay to estimate contemporary dispersal: a generic tool and its application to a major crop pathogen

Author: Amil
Barton
Broquet
Brown
Burt
Carlier
Chen
Daguin
Dieckmann
Endler
Fisher
Gay
Giraud
Goudet
Guillot
Halkett
Jones
Lapeyre de Bellaire
Lebreton
Lenormand
Lockwood
McCartney
McDonald
Mourichon
Neu
Pennisi
Rieux
Robert
Ronce
Rousset
Saccheri
Sackett
Slatkin
Storey
Turelli
Zapater
Publication date: 01/01/2013

Dispersal is a key parameter of adaptation, invasion and persistence. Yet standard population genetics inference methods can hardly distinguish it from drift, and many species cannot be studied by direct mark-recapture methods. Here, we introduce a method that uses rates of change in cline shapes for neutral markers to estimate contemporary dispersal. We apply it to the devastating banana pest Mycosphaerella fijiensis, a wind-dispersed fungus for which a secondary contact zone had previously been detected using landscape genetics tools. By tracking the spatio-temporal frequency change of 15 microsatellite markers, we find that σ, the standard deviation of parent–offspring dispersal distances, is 1.2 km/generation^(1/2). The analysis is further shown to be robust to a large range of dispersal kernels. We conclude that combining landscape genetics approaches to detect breaks in allelic frequencies with analyses of changes in neutral genetic clines offers a powerful way to obtain ecologically relevant estimates of dispersal in many species.
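To illustrate why a widening neutral cline carries information about σ, here is a minimal Python sketch. It assumes the textbook diffusion approximation for a neutral cline that starts as a sharp step at secondary contact, under which the cline width (the inverse of the maximum slope) grows as w = σ·sqrt(2πt). This is not necessarily the estimator used in the paper; the function names and the generation counts are hypothetical.

```python
import math


def neutral_cline_width(sigma, generations):
    """Width (1 / maximum slope) of a neutral cline `generations` after
    secondary contact, assuming diffusion from an initial step:
    w = sigma * sqrt(2 * pi * t)."""
    return sigma * math.sqrt(2.0 * math.pi * generations)


def sigma_from_widths(width_early, width_late, generations_elapsed):
    """Recover sigma from two width measurements taken
    `generations_elapsed` generations apart, using
    w_late^2 - w_early^2 = 2 * pi * sigma^2 * dt."""
    gain = width_late ** 2 - width_early ** 2
    return math.sqrt(gain / (2.0 * math.pi * generations_elapsed))


# Hypothetical sampling times (in generations since contact), for illustration only.
w_early = neutral_cline_width(1.2, 10)
w_late = neutral_cline_width(1.2, 16)
sigma_hat = sigma_from_widths(w_early, w_late, 6)
```

Because the squared width grows linearly in time under this assumption, only the elapsed number of generations between the two samples is needed, not the date of the original contact.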

Power System Parameters Forecasting Using Hilbert-Huang Transform and Machine Learning

Author: Kurbatsky Victor
Leahy Paul
Sidorov Denis
Spiryaev Vadim
Tomin Nikita
Zhukov Alexei
Publication date: 01/04/2014

A novel hybrid data-driven approach is developed for forecasting power system parameters, with the goal of increasing the efficiency of short-term forecasting studies for non-stationary time series. The proposed approach is based on mode decomposition and a feature analysis of initial retrospective data using the Hilbert-Huang transform and machine learning algorithms. Random forests and gradient boosting trees were examined as learning techniques. The decision tree techniques were used to rank the importance of the variables employed in the forecasting models. The Mean Decrease Gini index is employed as an impurity function. The resulting hybrid forecasting models employ the radial basis function neural network and support vector regression. Apart from the introduction and references, the paper is organized as follows. Section 2 presents the background and a review of several approaches for short-term forecasting of power system parameters. In the third section, a hybrid machine learning-based algorithm using the Hilbert-Huang transform is developed for short-term forecasting of power system parameters. The fourth section describes the decision tree learning algorithms used to assess variable importance. Finally, section six presents the experimental results for the following electric power problems: active power flow forecasting, electricity price forecasting, and wind speed and direction forecasting.
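The Gini-based importance ranking mentioned in the abstract can be made concrete with a small Python sketch. It computes the Gini impurity of a set of class labels and the impurity decrease produced by one split. A Mean Decrease Gini score accumulates such decreases over every split that uses a given variable and averages them across the trees of a forest. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease when `parent` is split into `left` and `right`,
    with each child weighted by its share of the parent's samples."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini_impurity(left)
        + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Toy labels (hypothetical): a split that separates the two classes perfectly.
parent = ["high", "high", "low", "low"]
perfect_split = gini_decrease(parent, ["high", "high"], ["low", "low"])
useless_split = gini_decrease(parent, ["high", "low"], ["high", "low"])
```

A variable whose splits tend to produce large decreases, like the first split here, ranks as more important than one whose splits leave the child nodes as mixed as the parent.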

arXiv.org e-Print Archive

CiteSeerX

Irish Universities

Directory of Open Access Journals

Cork Open Research Archive