
1,778 research outputs found

Nonparametric Regression via StatLSSVM

Author: De Brabanter Kris
De Moor Bart
Suykens Johan
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2013

We present a new MATLAB toolbox under Windows and Linux for nonparametric regression estimation based on the statistical library for least squares support vector machines (StatLSSVM). The StatLSSVM toolbox is written so that only a few lines of code are necessary to perform standard nonparametric regression, regression with correlated errors and robust regression. In addition, construction of additive models and of pointwise or uniform confidence intervals is supported. A number of tuning criteria are available, such as classical cross-validation, robust cross-validation and cross-validation for correlated errors. Minimization of these criteria is also available without any user interaction.
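As a rough illustration of the kind of estimator the abstract refers to, the sketch below fits a least squares support vector machine regressor by solving the standard LS-SVM dual linear system with an RBF kernel. It is written in Python with NumPy and is not the StatLSSVM MATLAB API; the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`) and the parameters `gamma` (regularisation) and `sigma` (kernel bandwidth) are assumptions chosen for this example, and no tuning, robust weighting or confidence intervals are included.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    # Solve the LS-SVM dual system
    #   [ 0   1^T         ] [ b     ]   [ 0 ]
    #   [ 1   K + I/gamma ] [ alpha ] = [ y ]
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    # f(x) = sum_i alpha_i * k(x, x_i) + b
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

In this formulation the training residuals equal `alpha / gamma`, which is one reason `gamma` is usually chosen by a criterion such as cross-validation rather than fixed by hand.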

Digital Repository @ Iowa State University (ISU)

Crossref

Directory of Open Access Journals

Journal of Statistical Software

Semi-Supervised Kernel PCA

Author: Christian Walder
Lars Kai Hansen
Morten Mørup
Ricardo Henao
Publication date: 01/01/2010

We present three generalisations of Kernel Principal Components Analysis (KPCA) which incorporate knowledge of the class labels of a subset of the data points. The first, MV-KPCA, penalises within-class variances, similar to Fisher discriminant analysis. The second, LSKPCA, is a hybrid of least squares regression and kernel PCA. The third, LR-KPCA, is an iteratively reweighted version of LSKPCA which achieves a sigmoid loss function on the labeled points. We provide a theoretical risk bound as well as illustrative experiments on real and toy data sets.
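For orientation, the sketch below implements plain, unsupervised kernel PCA, the baseline that the three variants generalise. It does not implement MV-KPCA, LSKPCA or LR-KPCA, and it uses no label information. The function name `kernel_pca`, the RBF kernel and the bandwidth parameter `sigma` are assumptions made for this example.

```python
import numpy as np

def kernel_pca(X, n_components, sigma):
    n = X.shape[0]
    # RBF kernel matrix on the training points.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))
    # Centre the kernel matrix in feature space: Kc = H K H.
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    # Eigendecomposition of the symmetric centred kernel matrix.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[idx]
    V = eigvecs[:, idx]
    # Projections of the training points onto the leading components.
    Z = V * np.sqrt(lam)
    return Z, lam
```

The returned columns of `Z` are mutually orthogonal and centred, with squared norms equal to the corresponding eigenvalues; a semi-supervised variant would change the objective that selects the directions, not this centring step.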

arXiv.org e-Print Archive

CiteSeerX

Online Research Database In Technology

Kernel learning at the first level of inference

Author: Cawley G.C.
Talbot N.L.C.
Publication venue: Elsevier BV
Publication date: 02/02/2014

Kernel learning methods, whether Bayesian or frequentist, typically involve multiple levels of inference, with the coefficients of the kernel expansion being determined at the first level and the kernel and regularisation parameters carefully tuned at the second level, a process known as model selection. Model selection for kernel machines is commonly performed via optimisation of a suitable model selection criterion, often based on cross-validation or theoretical performance bounds. However, if there are a large number of kernel parameters, as for instance in the case of automatic relevance determination (ARD), there is a substantial risk of over-fitting the model selection criterion, resulting in poor generalisation performance. In this paper we investigate the possibility of learning the kernel, for the Least-Squares Support Vector Machine (LS-SVM) classifier, at the first level of inference, i.e. parameter optimisation. The kernel parameters and the coefficients of the kernel expansion are jointly optimised at the first level of inference, minimising a training criterion with an additional regularisation term acting on the kernel parameters. The key advantage of this approach is that the values of only two regularisation parameters need be determined in model selection, substantially alleviating the problem of over-fitting the model selection criterion. The benefits of this approach are demonstrated using a suite of synthetic and real-world binary classification benchmark problems, where kernel learning at the first level of inference is shown to be statistically superior to the conventional approach, improves on our previous work (Cawley and Talbot, 2007) and is competitive with Multiple Kernel Learning approaches, but with reduced computational expense.

Crossref

University of East Anglia digital repository

Current Mathematical Methods Used in QSAR/QSPR Studies

Author: Agatonovic-Kustrin
Agrafiotis
Bhonsle
Carlucci
Chen
Cheng
Cho
Cortes
Davies
Deeb
Du
Elliott
Equbal
Fan
Fisz
Friedman
Gharagheizi
Gong
Goudarzi
Guha
Gunturi
Hashemianzadeh
Ibric
Ji
Jores-Kong
Joseph
Jung
Kahn
Kansal
Karimi
Katritzky
Leonard
Li
Liang
Liu
Lu
Luan
Ma
Mager
Mandal
Niazi
Nunthanavanit
Om
Peixun Liu
Priolo
Psihogios
Qi
Qin
Rebehmed
Ren
Riahi
Rogers
Rouhollahi
Roy
Samadi-Maybodi
Samee
Sammi
Sattari
Shi
Si
Specht
Srivani
Suykens
Szaleniec
Tetko
Thomas Leonard
Vapnik
Vatani
Wang
Wei Long
Wold
Word
Xia
Yang
Yap
Yin
Yuan
Zhao
Publication venue: Molecular Diversity Preservation International (MDPI)
Publication date: 22/12/2009

This paper gives an overview of the mathematical methods currently used in quantitative structure-activity/property relationship (QSAR/QSPR) studies. The mathematical methods applied to the regression of QSAR/QSPR models have recently been developing very fast, and new methods such as Gene Expression Programming (GEP), Projection Pursuit Regression (PPR) and Local Lazy Regression (LLR) have appeared on the QSAR/QSPR stage. At the same time, earlier methods, including Multiple Linear Regression (MLR), Partial Least Squares (PLS), Neural Networks (NN) and Support Vector Machines (SVM), are being upgraded to improve their performance in QSAR/QSPR studies. These new and upgraded methods and algorithms are described in detail, and their advantages and disadvantages are evaluated and discussed to show their potential for future application in QSAR/QSPR studies.

CiteSeerX

Crossref

PubMed Central

Automating Large-Scale Simulation Calibration to Real-World Sensor Data

Author: Edwards Richard Everett
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/05/2013

Many key decisions and design policies are made using sophisticated computer simulations. However, these simulations have several major problems. The two main issues are 1) gaps between the simulation model and the actual structure, and 2) limitations of the modeling engine's capabilities. This dissertation's goal is to address these simulation deficiencies by presenting a general automated process for tuning simulation inputs such that simulation output matches real-world measured data. The automated process involves the following key components: 1) Identify a model that accurately estimates the real-world simulation calibration target from measured sensor data; 2) Identify the key real-world measurements that best estimate the simulation calibration target; 3) Construct a mapping from the most useful real-world measurements to actual simulation outputs; 4) Build fast and effective simulation approximation models that predict simulation output using simulation input; 5) Build a relational model that captures inter-variable dependencies between simulation inputs and outputs; and finally 6) Use the relational model to estimate the simulation input variables from the mapped sensor data, and use either the simulation model or the approximate simulation model to fine-tune input simulation parameter estimates towards the calibration system. The work in this dissertation individually validates and completes five of the six calibration components with respect to the residential energy domain. Step 1 is satisfied by identifying the best model for predicting next-hour residential electrical consumption, the calibration target. Step 2 is completed by identifying the most important sensors for predicting residential electrical consumption, the real-world measurements. While step 3 is completed by domain experts, step 4 is addressed by using techniques from the Big Data machine learning domain to build approximations for the EnergyPlus (E+) simulator. Step 5's solution leverages the same Big Data machine learning techniques to build a relational model that describes how the simulator's variables are probabilistically related. Finally, step 6 is partially demonstrated by using the relational model to estimate simulation parameters for E+ simulations with known ground-truth simulation inputs.
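The idea behind steps 4 and 6 (approximate the simulator, then estimate the inputs that reproduce measured outputs) can be sketched on a toy problem. Everything below is hypothetical: `toy_simulator` is an affine stand-in with two inputs and three outputs, not EnergyPlus, and the linear least-squares surrogate replaces the machine learning and probabilistic relational models the dissertation describes. The affine form is chosen so that the linear surrogate is exact and the recovery can be checked.

```python
import numpy as np

def toy_simulator(x):
    # Hypothetical stand-in for an expensive simulation: 2 inputs -> 3 outputs.
    W = np.array([[2.0, -1.0], [0.5, 3.0], [1.0, 1.0]])
    c = np.array([0.1, -0.2, 0.3])
    return W @ x + c

def fit_linear_surrogate(X, Y):
    # Fit Y ~ [X, 1] @ B by least squares from sampled simulator runs.
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    B, *_ = np.linalg.lstsq(Xa, Y, rcond=None)
    return B  # shape (n_inputs + 1, n_outputs)

def calibrate(B, y_measured):
    # Invert the surrogate: find inputs whose predicted output
    # is closest (in least squares) to the measured output.
    W = B[:-1].T  # (n_outputs, n_inputs)
    c = B[-1]     # intercept per output
    x_hat, *_ = np.linalg.lstsq(W, y_measured - c, rcond=None)
    return x_hat
```

With a nonlinear simulator the surrogate would only be approximate, so the estimate from `calibrate` would serve as a starting point to be refined against the simulator itself.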

University of Tennessee, Knoxville: Trace