Generalized additive and index models with shape constraints
We study generalised additive models, with shape restrictions (e.g.
monotonicity, convexity, concavity) imposed on each component of the additive
prediction function. We show that this framework facilitates a nonparametric
estimator of each additive component, obtained by maximising the likelihood.
The procedure is free of tuning parameters and under mild conditions is proved
to be uniformly consistent on compact intervals. More generally, our
methodology can be applied to generalised additive index models. Here again,
the procedure can be justified on theoretical grounds and, like the original
algorithm, possesses highly competitive finite-sample performance. Practical
utility is illustrated through the use of these methods in the analysis of two
real datasets. Our algorithms are publicly available in the \texttt{R} package
\textbf{scar}, short for \textbf{s}hape-\textbf{c}onstrained \textbf{a}dditive
\textbf{r}egression.
Both authors are supported by the second author’s Engineering and Physical Sciences Research Council Fellowship EP/J017213/1. This is the final version of the article. It first appeared from Wiley via http://dx.doi.org/10.1111/rssb.1213
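In the monotone Gaussian case, maximising the likelihood for a single component reduces to isotonic least squares, which the classical pool-adjacent-violators algorithm (PAVA) solves exactly. The sketch below is only this building block, not the \texttt{scar} implementation (which handles several components, other shape constraints and general link functions).

```python
# Pool-adjacent-violators algorithm (PAVA): the exact minimiser of
# sum_i (y_i - f_i)^2 over non-decreasing sequences f. A one-component,
# Gaussian-likelihood sketch, not the scar package itself.

def pava(y):
    blocks = []  # each block is [running sum, count]
    for v in y:
        blocks.append([float(v), 1])
        # merge backwards while adjacent block means violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit

print(pava([3, 1, 2, 4]))  # → [2.0, 2.0, 2.0, 4.0]
```

For a full additive fit one would cycle such cone projections over the components (backfitting on partial residuals), with convexity or concavity handled by analogous projections.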
Variable selection with error control: Another look at stability selection
Stability Selection was recently introduced by Meinshausen and Bühlmann
(2010) as a very general technique designed to improve the performance of a
variable selection algorithm. It is based on aggregating the results of
applying a selection procedure to subsamples of the data. We introduce a
variant, called Complementary Pairs Stability Selection (CPSS), and derive
bounds both on the expected number of variables included by CPSS that have low
selection probability under the original procedure, and on the expected number
of high selection probability variables that are excluded. These results
require no assumptions (e.g. exchangeability) on the underlying model or on the
quality of the original selection procedure. Under reasonable shape
restrictions, the bounds can be further tightened, yielding improved error
control, and therefore increasing the applicability of the methodology.
This is the accepted manuscript version. The final published version is available from Wiley at http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9868.2011.01034.x/abstract
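The subsampling scheme itself is easy to sketch: draw B random splits of {1, ..., n} into complementary halves, run the base selector on each half, and keep the variables whose empirical selection frequency clears a threshold tau. Everything below beyond that scheme (the toy selector interface, the default values) is illustrative, not the authors' code.

```python
import random

def cpss(n, p, select, B=50, tau=0.6, seed=0):
    """Complementary Pairs Stability Selection (sketch).

    select(idx) -> set of selected variable indices when the base
    procedure is run on the subsample with row indices idx.
    """
    rng = random.Random(seed)
    counts = [0] * p
    for _ in range(B):
        idx = list(range(n))
        rng.shuffle(idx)
        half = n // 2
        for part in (idx[:half], idx[half:]):  # a complementary pair
            for j in select(part):
                counts[j] += 1
    freq = [c / (2 * B) for c in counts]  # empirical selection frequency
    return {j for j in range(p) if freq[j] >= tau}, freq
```

Here `select` can be any base procedure, e.g. the lasso fitted to that subsample; the error-control bounds of the paper concern how often CPSS retains variables that the base procedure itself rarely selects.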
Nonparametric independence testing via mutual information
We propose a test of independence of two multivariate random vectors, given a
sample from the underlying population. Our approach, which we call MINT, is
based on the estimation of mutual information, whose decomposition into joint
and marginal entropies facilitates the use of recently-developed efficient
entropy estimators derived from nearest neighbour distances. The proposed
critical values, which may be obtained from simulation (in the case where one
marginal is known) or resampling, guarantee that the test has nominal size, and
we provide local power analyses, uniformly over classes of densities whose
mutual information satisfies a lower bound. Our ideas may be extended to
provide new goodness-of-fit tests of normal linear models based on assessing
the independence of our vector of covariates and an appropriately-defined
notion of an error vector. The theory is supported by numerical studies on both
simulated and real data.
EPSRC; Leverhulme Trust; SIMS fun
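A stripped-down version of the MINT recipe: estimate I(X;Y) = H(X) + H(Y) - H(X,Y) with the Kozachenko-Leonenko 1-nearest-neighbour entropy estimator and calibrate by permuting Y. The authors' estimators and critical values are considerably more refined (e.g. weighted, efficient versions); treat this as an illustration of the entropy decomposition, with all defaults made up.

```python
import math, random

def _digamma(n):
    # exact for positive integers: psi(n) = -gamma + H_{n-1}
    return -0.5772156649015329 + sum(1.0 / k for k in range(1, n))

def kl_entropy(pts):
    """Kozachenko-Leonenko 1-NN entropy estimate; pts is a list of tuples."""
    n, d = len(pts), len(pts[0])
    # log volume of the d-dimensional unit ball
    log_vd = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    s = 0.0
    for i, pt in enumerate(pts):
        rho = min(math.dist(pt, q) for j, q in enumerate(pts) if j != i)
        s += d * math.log(max(rho, 1e-12))
    return _digamma(n) - _digamma(1) + log_vd + s / n

def mint_pvalue(x, y, n_perm=99, seed=0):
    """Permutation p-value for H0: X independent of Y."""
    rng = random.Random(seed)
    def mi(xs, ys):
        joint = [a + b for a, b in zip(xs, ys)]  # concatenate coordinates
        return kl_entropy(xs) + kl_entropy(ys) - kl_entropy(joint)
    obs = mi(x, y)
    yy, hits = list(y), 0
    for _ in range(n_perm):
        rng.shuffle(yy)
        if mi(x, yy) >= obs:
            hits += 1
    return (1 + hits) / (1 + n_perm)
```

With strongly dependent samples the estimated mutual information of the observed pairing dominates that of the permuted pairings, so the p-value is small.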
Sparse principal component analysis via axis-aligned random projections
We introduce a new method for sparse principal component analysis, based on
the aggregation of eigenvector information from carefully-selected axis-aligned
random projections of the sample covariance matrix. Unlike most alternative
approaches, our algorithm is non-iterative, so is not vulnerable to a bad
choice of initialisation. We provide theoretical guarantees under which our
principal subspace estimator can attain the minimax optimal rate of convergence
in polynomial time. In addition, our theory provides a more refined
understanding of the statistical and computational trade-off in the problem of
sparse principal component estimation, revealing a subtle interplay between the
effective sample size and the number of random projections that are required to
achieve the minimax optimal rate. Numerical studies provide further insight
into the procedure and confirm its highly competitive finite-sample
performance.
The research of the first and third authors was supported by an Engineering and Physical Sciences Research Council (EPSRC) grant EP/N014588/1 for the Centre for Mathematical and Statistical Analysis of Multimodal Clinical Imaging. The second and third authors were supported by EPSRC Fellowships EP/J017213/1 and EP/P031447/1, and grant RG81761 from the Leverhulme Trust.
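The core idea in miniature: restrict the sample covariance matrix to many random axis-aligned coordinate subsets, keep the restrictions with the largest leading eigenvalue, aggregate the absolute entries of their leading eigenvectors into a per-coordinate score, and re-estimate on the highest-scoring coordinates. The selection rule and all defaults below are simplifications of the paper's algorithm, not a faithful implementation.

```python
import numpy as np

def sparse_pc(S, k, n_proj=200, n_keep=20, seed=0):
    """Sketch of sparse PCA via axis-aligned random projections.

    S: p x p sample covariance matrix; k: target sparsity level.
    """
    rng = np.random.default_rng(seed)
    p = S.shape[0]
    projections = []
    for _ in range(n_proj):
        A = rng.choice(p, size=k, replace=False)  # random axis-aligned projection
        w, V = np.linalg.eigh(S[np.ix_(A, A)])
        projections.append((w[-1], A, V[:, -1]))  # leading eigenpair of restriction
    projections.sort(key=lambda t: -t[0])
    score = np.zeros(p)
    for lam, A, v in projections[:n_keep]:        # carefully-selected projections
        score[A] += np.abs(v)                     # aggregate eigenvector information
    support = np.argsort(score)[-k:]              # estimated sparse support
    w, V = np.linalg.eigh(S[np.ix_(support, support)])
    u = np.zeros(p)
    u[support] = V[:, -1]
    return u
```

Note the procedure is non-iterative: no initial estimate of the principal component is ever required.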
The conditional permutation test for independence while controlling for confounders
We propose a general new method, the conditional permutation test, for
testing the conditional independence of variables $X$ and $Y$ given a
potentially high-dimensional random vector $Z$ that may contain confounding
factors. The proposed test permutes entries of $X$ non-uniformly, so as to
respect the existing dependence between $X$ and $Z$ and thus account for the
presence of these confounders. Like the conditional randomization test of
Cand\`es et al. (2018), our test relies on the availability of an approximation
to the distribution of $X$ given $Z$. While the test of Cand\`es et al. (2018)
uses this estimate to draw new $X$ values, for our test we use this
approximation to design an appropriate non-uniform distribution on permutations
of the $X$ values already seen in the true data. We provide an efficient Markov
chain Monte Carlo sampler for the implementation of our method, and establish
bounds on the Type I error in terms of the error in the approximation of the
conditional distribution of $X$ given $Z$, finding that, for the worst-case
test statistic, the inflation in Type I error of the conditional permutation
test is no larger than that of the conditional randomization test. We validate
these theoretical results with experiments on simulated data and on the Capital
Bikeshare data set.
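A minimal sketch under a strong simplifying assumption: the conditional law of X given Z = z is taken to be exactly N(z, 1), whereas in practice it must be estimated. Permutations are sampled with a simple Metropolis chain over pairwise swaps whose acceptance ratio involves only the assumed conditional density; the paper's sampler and theory are more sophisticated, and `abscorr` below is just one illustrative test statistic.

```python
import math, random

def cpt_pvalue(x, y, z, stat, n_perm=100, n_swaps=200, seed=0):
    """Conditional permutation test sketch, assuming X | Z=z ~ N(z, 1)."""
    rng = random.Random(seed)
    n = len(x)
    def logdens(xi, zi):          # log N(zi, 1) density, up to a constant
        return -0.5 * (xi - zi) ** 2
    obs = stat(x, y)
    perm = list(range(n))         # perm[i] = index of the x value at position i
    hits = 0
    for _ in range(n_perm):
        for _ in range(n_swaps):  # Metropolis over transpositions
            i, j = rng.randrange(n), rng.randrange(n)
            cur = logdens(x[perm[i]], z[i]) + logdens(x[perm[j]], z[j])
            new = logdens(x[perm[j]], z[i]) + logdens(x[perm[i]], z[j])
            if math.log(rng.random() + 1e-300) < new - cur:
                perm[i], perm[j] = perm[j], perm[i]
        xp = [x[m] for m in perm]
        if stat(xp, y) >= obs:
            hits += 1
    return (1 + hits) / (1 + n_perm)

def abscorr(a, b):
    """|Pearson correlation|, used here as the test statistic."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    ca = [u - ma for u in a]
    cb = [u - mb for u in b]
    num = sum(u * v for u, v in zip(ca, cb))
    den = (sum(u * u for u in ca) * sum(v * v for v in cb)) ** 0.5
    return abs(num / den)
```

Because the swaps favour exchanging x values between positions with similar z, the permuted copies retain the x-z dependence, so any residual x-y association beyond what z explains drives the p-value down.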
High-dimensional change point estimation via sparse projection
Changepoints are a very common feature of Big Data that arrive in the form of a data stream. In this paper, we study high-dimensional time series in which, at certain time points, the mean structure changes in a sparse subset of the coordinates. The challenge is to borrow strength across the coordinates in order to detect smaller changes than could be observed in any individual component series. We propose a two-stage procedure called 'inspect' for estimation of the changepoints: first, we argue that a good projection direction can be obtained as the leading left singular vector of the matrix that solves a convex optimisation problem derived from the CUSUM transformation of the time series. We then apply an existing univariate changepoint estimation algorithm to the projected series. Our theory provides strong guarantees on both the number of estimated changepoints and the rates of convergence of their locations, and our numerical studies validate its highly competitive empirical performance for a wide range of data generating mechanisms. Software implementing the methodology is available in the R package 'InspectChangepoint'.
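For a single changepoint the pipeline can be caricatured in a few lines: form the CUSUM transformation of the p x n data matrix, take a leading left singular vector as the projection direction, and locate the peak of the projected CUSUM series. The paper obtains the direction from a convex relaxation of this SVD step with a sparsity-inducing penalty, and handles multiple changepoints recursively; both refinements are omitted in this sketch.

```python
import numpy as np

def cusum_transform(X):
    """CUSUM transformation of a p x n matrix; returns a p x (n-1) matrix."""
    p, n = X.shape
    cs = np.cumsum(X, axis=1)
    total = cs[:, -1:]
    t = np.arange(1, n)
    left = cs[:, :-1] / t                    # mean of the first t observations
    right = (total - cs[:, :-1]) / (n - t)   # mean of the remaining n - t
    return np.sqrt(t * (n - t) / n) * (left - right)

def inspect_single(X):
    """Single-changepoint sketch of 'inspect' (no convex relaxation)."""
    T = cusum_transform(X)
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    proj = np.abs(U[:, 0] @ T)     # project onto leading left singular vector
    return int(np.argmax(proj)) + 1  # last time index of the first segment
```

Projecting first lets the procedure aggregate many small per-coordinate shifts into one detectable univariate change.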
Isotonic regression in general dimensions
We study the least squares regression function estimator over the class of real-valued functions on $[0,1]^d$ that are increasing in each coordinate. For uniformly bounded signals and with a fixed, cubic lattice design, we establish that the estimator achieves the minimax rate of order $n^{-\min\{2/(d+2),\,1/d\}}$ in the empirical $L_2$ loss, up to poly-logarithmic factors. Further, we prove a sharp oracle inequality, which reveals in particular that when the true regression function is piecewise constant on $k$ hyperrectangles, the least squares estimator enjoys a faster, adaptive rate of convergence of $(k/n)^{\min\{1,\,2/d\}}$, again up to poly-logarithmic factors. Previous results are confined to the case $d \leq 2$. Finally, we establish corresponding bounds (which are new even in the case $d = 2$) in the more challenging random design setting. There are two surprising features of these results: first, they demonstrate that it is possible for a global empirical risk minimisation procedure to be rate optimal up to poly-logarithmic factors even when the corresponding entropy integral for the function class diverges rapidly; second, they indicate that the adaptation rate for shape-constrained estimators can be strictly worse than the parametric rate.
The research of the first author is supported in part by NSF Grant DMS-1566514. The research of the second and fourth authors is supported by EPSRC Fellowship EP/J017213/1 and a grant from the Leverhulme Trust RG81761.
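In $d = 2$ the least squares isotonic fit (the projection onto matrices nondecreasing along both rows and columns) can be computed, in the limit, by Dykstra's alternating projections between the two single-direction monotone cones, each handled by the pool-adjacent-violators algorithm. This is a standard computational route, unrelated to the paper's theory, and the iteration count below is arbitrary.

```python
import numpy as np

def pava(y):
    """1-D isotonic least squares via pool-adjacent-violators."""
    blocks = []  # [running sum, count] per block
    for v in y:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out

def _rows_iso(M):
    return np.array([pava(row) for row in M])

def iso2d(Y, iters=100):
    """Bimonotone least squares fit via Dykstra's alternating projections."""
    Y = np.asarray(Y, dtype=float)
    x, p, q = Y.copy(), np.zeros_like(Y), np.zeros_like(Y)
    for _ in range(iters):
        y1 = _rows_iso(x + p)        # project onto 'rows nondecreasing'
        p = x + p - y1               # Dykstra correction for the first cone
        x = _rows_iso((y1 + q).T).T  # project onto 'columns nondecreasing'
        q = y1 + q - x               # correction for the second cone
    return x
```

For the completely antitonic input [[2, 1], [1, 0]] the projection onto the bimonotone cone is the constant matrix of ones, which the iteration reaches immediately.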
Comments on: High-dimensional simultaneous inference with the bootstrap
We congratulate the authors on their stimulating contribution to the burgeoning high-dimensional inference literature. The bootstrap offers such an attractive methodology in these settings, but it is well-known that its naive application in the context of shrinkage/superefficiency is fraught with danger (e.g. Samworth, 2003; Chatterjee and Lahiri, 2011). The authors show how these perils can be elegantly sidestepped by working with de-biased, or de-sparsified, versions of estimators.
EPSRC (EP/J017213/1); Leverhulme Trust (PLP-2014-353)
Ensemble of a subset of kNN classifiers
Combining multiple classifiers, an approach known as an ensemble method, can give a substantial improvement in the prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. We propose an ensemble of a subset of kNN classifiers, ESkNN, for classification tasks, built in two steps. Firstly, we choose classifiers based upon their individual performance using the out-of-sample accuracy. The selected classifiers are then combined sequentially, starting from the best model, and assessed for collective performance on a validation data set. We use benchmark data sets with their original and some added non-informative features for the evaluation of our method. The results are compared with usual kNN, bagged kNN, random kNN, the multiple feature subset method, random forests and support vector machines. Our experimental comparisons on benchmark classification problems and simulated data sets reveal that the proposed ensemble gives better classification performance than the usual kNN and its ensembles, and performs comparably to random forests and support vector machines.
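A toy sketch of the ESkNN recipe: train kNN classifiers on random feature subsets, rank them by held-out accuracy, then grow the ensemble greedily from the best model, keeping a member only if it does not hurt majority-vote accuracy. For brevity the same validation set is reused for both the ranking and the combination step, unlike the two-stage data usage described above; the tiny kNN implementation and all sizes are illustrative.

```python
import random

def knn_predict(train, labels, feats, x, k=3):
    """Majority vote among the k nearest training points over features feats."""
    order = sorted(range(len(train)),
                   key=lambda i: sum((train[i][f] - x[f]) ** 2 for f in feats))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

def esknn(train, ytr, val, yval, p, n_models=30, m=2, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):  # kNN classifiers on random feature subsets
        feats = rng.sample(range(p), m)
        acc = sum(knn_predict(train, ytr, feats, x) == y
                  for x, y in zip(val, yval)) / len(val)
        models.append((acc, feats))
    models.sort(key=lambda t: -t[0])   # rank by individual accuracy
    ensemble, best_acc = [], 0.0
    for acc, feats in models:          # greedy sequential combination
        trial = ensemble + [feats]
        preds = []
        for x in val:
            votes = [knn_predict(train, ytr, f, x) for f in trial]
            preds.append(max(set(votes), key=votes.count))
        trial_acc = sum(pr == y for pr, y in zip(preds, yval)) / len(yval)
        if trial_acc >= best_acc:      # keep a member only if it does not hurt
            ensemble, best_acc = trial, trial_acc
    return ensemble, best_acc
```

Subsets that miss the informative features score poorly and are filtered out, which is the mechanism by which the ensemble resists non-informative features.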
A multiple myeloma classification system that associates normal B-cell subset phenotypes with prognosis.
Despite the recent progress in treatment of multiple myeloma (MM), it is still an incurable malignant disease, and we are therefore in need of new risk stratification tools that can help us to understand the disease and optimize therapy. Here we propose a new subtyping of myeloma plasma cells (PCs) from diagnostic samples, assigned by normal B-cell subset associated gene signatures (BAGS). For this purpose, we combined fluorescence-activated cell sorting and gene expression profiles from normal bone marrow (BM) Pre-BI, Pre-BII, immature, naïve, memory, and PC subsets to generate BAGS for assignment of normal BM subtypes in diagnostic samples. The impact of the subtypes was analyzed in 8 available data sets from 1772 patients' myeloma PC samples. The resulting tumor assignments in available clinical data sets exhibited similar BAGS subtype frequencies in 4 cohorts from de novo MM patients across 1296 individual cases. The BAGS subtypes were significantly associated with progression-free and overall survival in a meta-analysis of 916 patients from 3 prospective clinical trials. The major impact was observed within the Pre-BII and memory subtypes, which had a significantly inferior prognosis compared with other subtypes. A multiple Cox proportional hazard analysis documented that BAGS subtypes added significant, independent prognostic information to the translocations and cyclin D classification. BAGS subtype analysis of patient cases identified transcriptional differences, including a number of differentially spliced genes. We identified subtype differences in myeloma at diagnosis, with prognostic impact and predictive potential, supporting an acquired B-cell trait and phenotypic plasticity as a pathogenetic hallmark of MM.