Stability Selection and Consensus Clustering in R: The R Package sharp

Abstract

The R package sharp (Stability-enHanced Approaches using Resampling Procedures) provides an integrated framework for stability-enhanced variable selection, graphical modeling and clustering. In stability selection, a feature selection algorithm is combined with a resampling technique to estimate feature selection probabilities. Features with selection proportions above a threshold are considered stably selected. Similarly, in consensus clustering, a clustering algorithm is applied to multiple subsamples of items to compute co-membership proportions. The consensus clusters are then obtained by clustering the items using the co-membership proportions as a measure of similarity. We calibrate the hyper-parameters of stability selection (or consensus clustering) jointly by maximizing a consensus score calculated under the null hypothesis of equiprobability of selection (or co-membership), which characterizes instability. The package offers flexibility in the modeling, includes diagnostic and visualization tools, and allows for parallelization.
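The stability selection procedure described in the abstract can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the sharp package's API. All function names, the toy data, and the default values are hypothetical. The selector is a simple stand-in (keep the k features most correlated with the outcome) in place of a penalized regression, and the threshold and k are fixed by hand here, whereas the package calibrates such hyper-parameters jointly by maximizing a consensus score, which this sketch does not implement.

```python
import random
from statistics import mean


def abs_correlation(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def select_top_k(X, y, k):
    """Stand-in selector (assumption): the k features most correlated with y."""
    p = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(p)]
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def selection_proportions(X, y, k, n_subsamples=100, fraction=0.5, seed=0):
    """Proportion of subsamples in which each feature is selected."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    size = max(2, int(fraction * n))
    counts = [0] * p
    for _ in range(n_subsamples):
        # Draw a subsample of observations without replacement.
        idx = rng.sample(range(n), size)
        chosen = select_top_k([X[i] for i in idx], [y[i] for i in idx], k)
        for j in chosen:
            counts[j] += 1
    return [c / n_subsamples for c in counts]


def stable_features(proportions, threshold=0.9):
    """Features whose selection proportion is above the threshold."""
    return [j for j, prop in enumerate(proportions) if prop > threshold]


# Toy data (hypothetical): only features 0 and 1 contribute to the outcome.
data_rng = random.Random(1)
n, p = 200, 10
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2 * row[0] - 2 * row[1] + data_rng.gauss(0, 1) for row in X]

props = selection_proportions(X, y, k=2)
stable = stable_features(props, threshold=0.9)
print(props)
print(stable)
```

The same resampling loop carries over to consensus clustering: instead of counting how often each feature is selected, one would count how often each pair of items falls in the same cluster across subsamples, then cluster the items using those co-membership proportions as similarities.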

This paper was published in Journal of Statistical Software.

Licence: https://creativecommons.org/licenses/by/4.0