The pls Package: Principal Component and Partial Least Squares Regression in R

Authors: Björn-Helge Mevik, Ron Wehrens

The pls package implements principal component regression (PCR) and partial least squares regression (PLSR) in R (R Development Core Team 2006b), and is freely available from the Comprehensive R Archive Network (CRAN), licensed under the GNU General Public License (GPL). The user interface is modelled after the traditional formula interface, as exemplified by lm. This was done so that people used to R would not have to learn yet another interface, and also because we believe the formula interface is a good way of working interactively with models. It thus has methods for generic functions like predict, update and coef. It also has more specialised functions like scores, loadings and RMSEP, and a flexible cross-validation system. Visual inspection and assessment are important in chemometrics, and the pls package has a number of plot functions for plotting scores, loadings, predictions, coefficients and RMSEP estimates. The package implements PCR and several algorithms for PLSR. The design is modular, so that it should be easy to use the underlying algorithms in other functions. It is our hope that the package will serve well both for interactive data analysis and as a building block for other functions or packages using PLSR or PCR. We will here describe the package and how it is used for data analysis, as well as how it can be used as a part of other packages. Also included is a section about formulas and data frames, for people not used to the R modelling idioms.
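
To make the quantities named in the abstract concrete (weights, scores, coefficients, predictions, RMSEP), here is a minimal sketch of partial least squares regression with a single response and a single component. It is written in plain Python rather than R and is not the pls package's code or interface: the function names `fit_pls1_one_component`, `predict` and `rmsep` are hypothetical, and the restriction to one component is an assumption made to keep the example short.

```python
import math


def fit_pls1_one_component(X, y):
    """Fit a one-component PLS model for a single response (illustrative only)."""
    n, p = len(X), len(X[0])
    # Center predictors and response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: the direction in predictor space with maximal
    # covariance with the response, scaled to unit length.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("predictors have zero covariance with the response")
    w = [v / norm for v in w]

    # Scores: projection of each centered sample onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]

    # Regress the centered response on the scores.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)

    # Express the model as coefficients on the original predictors.
    coef = [q * wj for wj in w]
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return {"coef": coef, "intercept": intercept, "weights": w, "scores": t}


def predict(model, X_new):
    """Predict responses for new samples from a fitted model."""
    return [
        model["intercept"] + sum(c * v for c, v in zip(model["coef"], row))
        for row in X_new
    ]


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)
```

With a single predictor the one-component fit coincides with ordinary least squares, which gives a quick sanity check. Further components would require deflating the predictor matrix after each step, which this sketch omits.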

Research Papers in Economics

Self- and Super-organizing Maps in R: The kohonen Package

Authors: Lutgarde M. C. Buydens, Ron Wehrens

In this age of ever-increasing data set sizes, especially in the natural sciences, visualisation becomes more and more important. Self-organizing maps have many features that make them attractive in this respect: they do not rely on distributional assumptions, can handle huge data sets with ease, and have shown their worth in a large number of applications. In this paper, we highlight the kohonen package for R, which implements self-organizing maps as well as some extensions for supervised pattern recognition and data fusion.

Research Papers in Economics

A comparison of computational approaches for maximum likelihood estimation of the Dirichlet parameters on high-dimensional data

Authors: Marco Giordan, Ron Wehrens
Publication date: 01/01/2015

Maximum likelihood estimates of the Dirichlet distribution parameters can be obtained only through numerical algorithms. Such algorithms can return estimates outside the valid range for the parameters and/or can require a large number of iterations to converge. These problems can be aggravated if good starting values are not provided. In this paper we discuss several approaches that can partially avoid these problems, providing a good trade-off between efficiency and stability. The performance of these approaches is compared on high-dimensional real and simulated data.

Diposit Digital de Documents de la UAB

A comparison of computational approaches for maximum likelihood estimation of the Dirichlet parameters on high-dimensional data

Authors: Marco Giordan, Ron Wehrens
Publication venue: Institut d'Estadística de Catalunya
Publication date: 01/01/2015

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Diposit Digital de Documents de la UAB

Meta-Statistics for Variable Selection: The R Package BioMark

Authors: Pietro Franceschi, Ron Wehrens
Publication venue: Foundation for Open Access Statistics
Publication date: 01/01/2012

Biomarker identification is an ever more important topic in the life sciences. With the advent of measurement methodologies based on microarrays and mass spectrometry, thousands of variables are routinely being measured on complex biological samples. Often, the question is what makes two groups of samples different. Classical hypothesis testing suffers from the multiple testing problem; however, correcting for this often leads to a lack of power. In addition, choosing α cutoff levels remains somewhat arbitrary. In a regression context, too, a model depending on few but relevant variables will be more accurate and precise, and easier to interpret biologically. We propose an R package, BioMark, implementing two meta-statistics for variable selection. The first, higher criticism, presents a data-dependent selection threshold for significance, instead of a cookbook value of α = 0.05. It is applicable in all cases where two groups are compared. The second, stability selection, is more general, and can also be applied in a regression context. This approach uses repeated subsampling of the data in order to assess the variability of the model coefficients and selects those that remain consistently important. It is shown using experimental spike-in data from the field of metabolomics that both approaches work well with real data. BioMark also contains functionality for simulating data with specific characteristics for algorithm development and testing.
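
The stability selection idea described in the abstract (subsample repeatedly, rank the variables each time, keep those that are consistently near the top) can be sketched as follows for a two-group comparison. This is an illustration in plain Python, not BioMark's implementation: the use of an absolute Welch t-statistic as the ranking criterion, the function names, and the parameters `n_rounds`, `frac`, `top_k` and `min_freq` are all assumptions made for the example.

```python
import random
import statistics


def welch_t(a, b):
    """Absolute Welch t-statistic for two samples (assumed ranking criterion)."""
    diff = abs(statistics.mean(a) - statistics.mean(b))
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    if se == 0:
        # No within-group spread: any nonzero difference is maximally extreme.
        return float("inf") if diff > 0 else 0.0
    return diff / se


def stability_selection(group_a, group_b, n_rounds=200, frac=0.7,
                        top_k=1, min_freq=0.8, seed=0):
    """Return (selected variable indices, selection frequency per variable).

    group_a and group_b are lists of samples; each sample is a list with one
    value per variable.
    """
    rng = random.Random(seed)
    p = len(group_a[0])
    counts = [0] * p
    size_a = max(2, int(frac * len(group_a)))
    size_b = max(2, int(frac * len(group_b)))

    for _ in range(n_rounds):
        # Draw a subsample without replacement from each group.
        sub_a = rng.sample(group_a, size_a)
        sub_b = rng.sample(group_b, size_b)
        # Score every variable on this subsample.
        stats = [
            welch_t([row[j] for row in sub_a], [row[j] for row in sub_b])
            for j in range(p)
        ]
        # Count the variables that rank among the top_k in this round.
        ranked = sorted(range(p), key=lambda j: stats[j], reverse=True)
        for j in ranked[:top_k]:
            counts[j] += 1

    freqs = [c / n_rounds for c in counts]
    selected = [j for j, f in enumerate(freqs) if f >= min_freq]
    return selected, freqs
```

A variable is reported only if it lands in the top `top_k` in at least a fraction `min_freq` of the subsampling rounds, so a variable that ranks highly on the full data merely by chance tends to be filtered out. In a regression setting the per-variable statistic would be replaced by the magnitude of a model coefficient.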

Archivio istituzionale della ricerca - Fondazione Edmund Mach

Directory of Open Access Journals

Journal of Statistical Software

Self- and Super-organizing Maps in R: The kohonen Package

Authors: Lutgarde M. C. Buydens, Ron Wehrens
Publication venue: Foundation for Open Access Statistics
Publication date: 01/10/2007

Directory of Open Access Journals

Journal of Statistical Software

Who is ‘in’ and who is ‘out’? Participation of older persons in health research and the interplay between capital, habitus and field

Authors: Oldenhof L.E. (Lieke), Wehrens R.L.E. (Rik)
Publication venue: 'Informa UK Limited'
Publication date: 22/02/2018

Inclusion and exclusion processes in community engagement do not take place in a vacuum, but are embedded in social, political and institutional contexts. T

EUR Research Repository

Erasmus University Digital Repository