Convergence rates of Kernel Conjugate Gradient for random design regression
We prove statistical rates of convergence for kernel-based least squares
regression from i.i.d. data using a conjugate gradient algorithm, where
regularization against overfitting is obtained by early stopping. This method
is related to Kernel Partial Least Squares, a regression method that combines
supervised dimensionality reduction with least squares projection. Following
the setting introduced in earlier related literature, we study so-called "fast
convergence rates" depending on the regularity of the target regression
function (measured by a source condition in terms of the kernel integral
operator) and on the effective dimensionality of the data mapped into the
kernel space. We obtain upper bounds, essentially matching known minimax lower
bounds, for the L2 (prediction) norm as well as for the stronger
Hilbert norm, if the true regression function belongs to the reproducing kernel
Hilbert space. If the latter assumption is not fulfilled, we obtain similar
convergence rates for appropriate norms, provided additional unlabeled data are
available.
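The abstract describes conjugate gradient applied to kernel least squares, with the number of iterations (early stopping) playing the role of the regularization parameter. The paper's exact iteration and stopping rule are not reproduced here; the following is a minimal sketch of plain CG on the kernel system K·alpha = y, where stopping after a few iterations prevents overfitting (the toy data and bandwidth are illustrative assumptions):

```python
import numpy as np

def kernel_cg_regression(K, y, n_iters):
    """Run conjugate gradient on K @ alpha = y, stopping after n_iters
    iterations; the iteration count acts as the regularizer."""
    alpha = np.zeros(len(y))
    r = y - K @ alpha          # residual
    p = r.copy()               # search direction
    for _ in range(n_iters):
        Kp = K @ p
        denom = p @ Kp
        if denom <= 1e-12:     # direction exhausted; stop early
            break
        step = (r @ r) / denom
        alpha = alpha + step * p
        r_new = r - step * Kp
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return alpha

# toy example: Gaussian kernel on 1-D random-design data
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=40)
y = np.sin(3 * X) + 0.1 * rng.normal(size=40)
K = np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / 0.1)
alpha = kernel_cg_regression(K, y, n_iters=5)
pred = K @ alpha
```

Running more iterations drives the training residual toward zero, which is exactly the overfitting that early stopping is meant to avoid; in practice the iteration count is chosen by a validation criterion.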
Kernel-Partial Least Squares regression coupled to pseudo-sample trajectories for the analysis of mixture designs of experiments
This article explores the potential of Kernel-Partial Least Squares (K-PLS) regression for the analysis of data proceeding from mixture designs of experiments. Gower's idea of pseudo-sample trajectories is exploited for interpretation purposes. The results show that, when the datasets under study are affected by severe nonlinearities and comprise few observations, the proposed approach can represent a feasible alternative to classical methodologies (i.e. Scheffe polynomial fitting by means of Ordinary Least Squares - OLS - and Cox polynomial fitting by means of Partial Least Squares - PLS). Furthermore, a way of recovering the parameters of a Scheffe model (provided that it holds and has the same complexity as the K-PLS one) from the trend of the aforementioned pseudo-sample trajectories is illustrated via a simulated case study. This research work was partially supported by the Spanish Ministry of Economy and Competitiveness under the project DPI2014-55276-C5-1R and Shell Global Solutions International B.V. (Amsterdam, The Netherlands).
Vitale, R.; Palací-López, D. G.; Kerkenaar, H.; Postma, G.; Buydens, L.; Ferrer, A. (2018). Kernel-Partial Least Squares regression coupled to pseudo-sample trajectories for the analysis of mixture designs of experiments. Chemometrics and Intelligent Laboratory Systems, 175:37-46. https://doi.org/10.1016/j.chemolab.2018.02.002
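The article builds on kernel PLS; its mixture-design application and pseudo-sample machinery are not reproduced here, but the core component-extraction step can be sketched. The following is a minimal univariate kernel-PLS sketch (NIPALS-style, with kernel and response deflation); the toy data, kernel, and bandwidth are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def kernel_pls(K, y, n_components):
    """Minimal univariate kernel PLS: extract orthonormal score vectors
    from the kernel matrix, deflating K and y after each component."""
    n = len(y)
    Kc = K.astype(float).copy()
    yc = y.astype(float).copy()
    T = np.zeros((n, n_components))
    for a in range(n_components):
        t = Kc @ yc                      # score direction for this component
        t /= np.linalg.norm(t)
        T[:, a] = t
        P = np.eye(n) - np.outer(t, t)   # projector orthogonal to t
        Kc = P @ Kc @ P                  # deflate the kernel matrix
        yc = yc - t * (t @ yc)           # deflate the response
    b = T.T @ y                          # regression on the orthonormal scores
    return T, b

# toy example: Gaussian kernel on 1-D data
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=30)
y = np.exp(-4 * (X - 0.5) ** 2) + 0.05 * rng.normal(size=30)
K = np.exp(-(X[:, None] - X[None, :]) ** 2 / 0.1)
T, b = kernel_pls(K, y, n_components=3)
y_fit = T @ b
```

Because the scores are built from K @ y, each component combines supervised dimensionality reduction with a least-squares fit, which is the combination the abstract above attributes to (kernel) PLS.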
Gene Function Prediction from Functional Association Networks Using Kernel Partial Least Squares Regression
With the growing availability of large-scale biological datasets, automated methods of extracting functionally meaningful information from these data are becoming increasingly important. Data relating to functional association between genes or proteins, such as co-expression or functional association, is often represented in terms of gene or protein networks. Several methods of predicting gene function from these networks have been proposed. However, evaluating the relative performance of these algorithms may not be trivial: concerns have been raised over biases in different benchmarking methods and datasets, particularly relating to non-independence of functional association data and test data. In this paper we propose a new network-based gene function prediction algorithm using a commute-time kernel and partial least squares regression (Compass). We compare Compass to GeneMANIA, a leading network-based prediction algorithm, using a number of different benchmarks, and find that Compass outperforms GeneMANIA on these benchmarks. We also explicitly explore problems associated with the non-independence of functional association data and test data. We find that a benchmark based on the Gene Ontology database, which, directly or indirectly, incorporates information from other databases, may considerably overestimate the performance of algorithms exploiting functional association data for prediction.
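Compass pairs a commute-time kernel with PLS regression; the full pipeline is not given in the abstract, but the commute-time kernel itself is a standard construction: the Moore-Penrose pseudoinverse of the graph Laplacian. A minimal sketch, with a made-up five-node network standing in for a functional-association graph:

```python
import numpy as np

def commute_time_kernel(A):
    """Commute-time kernel of an undirected graph: the Moore-Penrose
    pseudoinverse of the combinatorial Laplacian L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.pinv(L)

# toy functional-association network over 5 "genes" (illustrative only)
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
K = commute_time_kernel(A)
# expected round-trip commute time of a random walk between nodes i and j
vol = A.sum()
commute = vol * (np.diag(K)[:, None] + np.diag(K)[None, :] - 2 * K)
```

The kernel is symmetric positive semidefinite, so it can feed directly into a kernel regression method such as the kernel PLS described above; nodes that are well connected get small commute times and large kernel similarity.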