Robust rank correlation based screening
Independence screening is a variable selection method that uses a ranking
criterion to select significant variables, particularly for statistical models
with nonpolynomial dimensionality or "large p, small n" paradigms, where p can be
as large as an exponential of the sample size n. In this paper we propose a
robust rank correlation screening (RRCS) method to deal with ultra-high
dimensional data. The new procedure is based on the Kendall \tau correlation
coefficient between response and predictor variables rather than the Pearson
correlation of existing methods. The new method has four desirable features
compared with existing independence screening methods. First, the sure
independence screening property holds under only a second-order moment
condition on the predictor variables, rather than exponential tails or the
like, even when the number of predictor variables grows exponentially with
the sample size. Second, it can deal with semiparametric models such as
transformation regression models and single-index models, under a
monotonicity constraint on the link function, without involving
nonparametric estimation even when the models contain nonparametric
functions. Third, the procedure is robust against outliers and influential
points in the observations. Last, the use of indicator functions in rank
correlation screening greatly simplifies the theoretical derivation, owing to the
boundedness of the resulting statistics, compared with previous studies on
variable screening. Simulations are carried out for comparisons with existing
methods and a real data example is analyzed.

Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by
the Institute of Mathematical Statistics (http://www.imstat.org),
http://dx.doi.org/10.1214/12-AOS1024. arXiv admin note: text overlap with
arXiv:0903.525
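The ranking step behind such a screening procedure is simple to sketch. The snippet below scores each predictor by its absolute Kendall tau with the response and keeps the top-ranked columns; names like `rrcs_screen` are illustrative, not the authors' code, and this is a minimal O(n^2)-per-pair sketch rather than an efficient implementation.

```python
from itertools import combinations

def sign(v):
    return (v > 0) - (v < 0)

def kendall_tau(x, y):
    """Kendall tau-a between two equal-length sequences:
    (concordant pairs - discordant pairs) / total pairs."""
    n = len(x)
    s = sum(sign(x[i] - x[j]) * sign(y[i] - y[j])
            for i, j in combinations(range(n), 2))
    return 2 * s / (n * (n - 1))

def rrcs_screen(X, y, k):
    """Rank the columns of X (a list of rows) by |tau| with y and
    return the top-k column indices."""
    p = len(X[0])
    score = [abs(kendall_tau([row[j] for row in X], y)) for j in range(p)]
    return sorted(range(p), key=lambda j: score[j], reverse=True)[:k]
```

Because Kendall tau depends only on the ordering of the observations, any monotone transformation of a predictor (as in transformation or single-index models) leaves its score unchanged, which is the intuition behind the method's second feature.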
Efficient Rank Reduction of Correlation Matrices
Geometric optimisation algorithms are developed that efficiently find the
nearest low-rank correlation matrix. We show, in numerical tests, that our
methods compare favourably to the existing methods in the literature. The
connection with the Lagrange multiplier method is established, along with an
identification of whether a local minimum is a global minimum. An additional
benefit of the geometric approach is that any weighted norm can be applied. The
problem of finding the nearest low-rank correlation matrix occurs as part of
the calibration of multi-factor interest rate market models to correlation.

Comment: 21 pages, 6 figures
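For orientation, a common naive baseline for this problem (not the paper's geometric optimisation) is spectral truncation followed by a diagonal rescaling: keep the k largest eigenvalues, rebuild the matrix, and normalise the factor rows so the diagonal is exactly one. A minimal sketch, assuming the input is a valid correlation matrix with no zero rows in the truncated factor:

```python
import numpy as np

def nearest_low_rank_corr(C, k):
    """Spectral-truncation heuristic for a rank-k correlation matrix
    near C (a baseline, not the geometric algorithm of the paper)."""
    w, V = np.linalg.eigh(C)               # eigenvalues in ascending order
    w = np.clip(w, 0.0, None)              # drop negative numerical noise
    idx = np.argsort(w)[::-1][:k]          # indices of the k largest
    B = V[:, idx] * np.sqrt(w[idx])        # n x k factor, so B @ B.T ~ C
    B /= np.linalg.norm(B, axis=1, keepdims=True)  # unit rows -> unit diagonal
    return B @ B.T
```

The row normalisation is what distinguishes this from plain eigenvalue truncation: it restores the unit diagonal that a correlation matrix requires, at the cost of no longer being the exact nearest matrix in Frobenius norm, which is why dedicated optimisation methods such as those in the paper are used in practice.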
Correlation Clustering with Low-Rank Matrices
Correlation clustering is a technique for aggregating data based on
qualitative information about which pairs of objects are labeled 'similar' or
'dissimilar.' Because the optimization problem is NP-hard, much of the previous
literature focuses on finding approximation algorithms. In this paper we
explore how to solve the correlation clustering objective exactly when the data
to be clustered can be represented by a low-rank matrix. We prove in particular
that correlation clustering can be solved in polynomial time when the
underlying matrix is positive semidefinite with small constant rank, but that
the task remains NP-hard in the presence of even one negative eigenvalue. Based
on our theoretical results, we develop an algorithm for efficiently "solving"
low-rank positive semidefinite correlation clustering by employing a procedure
for zonotope vertex enumeration. We demonstrate the effectiveness and speed of
our algorithm by using it to solve several clustering problems on both
synthetic and real-world data.
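For intuition about the objective itself, minimising the number of disagreements ('similar' pairs split apart plus 'dissimilar' pairs placed together), here is a brute-force exact solver. It enumerates all label assignments, so it is exponential in n and usable only on tiny instances; it is not the zonotope-based algorithm of the paper.

```python
from itertools import product

def disagreements(labels, sim):
    """sim[i][j] = +1 ('similar'), -1 ('dissimilar'), 0 (no label).
    Count pairs whose clustering contradicts their label."""
    n = len(labels)
    cost = 0
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            if (sim[i][j] == 1 and not same) or (sim[i][j] == -1 and same):
                cost += 1
    return cost

def best_clustering(sim):
    """Exhaustive search over all cluster-label assignments."""
    n = len(sim)
    best = None
    for labels in product(range(n), repeat=n):
        c = disagreements(labels, sim)
        if best is None or c < best[0]:
            best = (c, labels)
    return best
```

Note that, unlike most clustering formulations, the number of clusters is not fixed in advance: the search over label assignments implicitly considers every possible number of clusters, and the objective alone decides it.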
AN EXHAUSTIVE COEFFICIENT OF RANK CORRELATION
Rank association is a fundamental tool for expressing dependence in cases in which data are arranged in order. Measures of rank correlation have been developed in several contexts for more than a century, and we are able to cite more than thirty such coefficients, from simple ones to relatively complicated definitions invoking one or more systems of weights. However, only a few of these can actually be considered admissible substitutes for Pearson's correlation. The main drawback of the vast majority of coefficients is their "resistance to change", which appears to be of limited value for the purposes of rank comparisons that are intrinsically robust. In this article, a new nonparametric correlation coefficient is defined that is based on the principle of maximization of a ratio of two ranks. In comparison with existing rank correlations, it was found to have extremely high sensitivity to permutation patterns. We illustrate the potential improvement that our index can provide in economic contexts by comparing published results with those obtained through the use of the new index. The success that we have had suggests that our index may have important applications wherever the discriminatory power of the rank correlation coefficient needs to be particularly strong.

Keywords: Ordinal data, Nonparametric agreement, Economic applications
COMPARING THE EFFECTIVENESS OF RANK CORRELATION STATISTICS
Rank correlation is a fundamental tool for expressing dependence in cases in which the data are arranged in order. There are, by contrast, circumstances where the ordinal association is of a nonlinear type. In this paper we investigate the effectiveness of several measures of rank correlation. These measures have been divided into three classes: conventional rank correlations, weighted rank correlations, and correlations of scores. Our findings suggest that none is systematically better than the others in all circumstances. However, a simply weighted version of the Kendall rank correlation coefficient provides plausible answers in many special situations where inter-category distances cannot be considered on the same basis.

Keywords: Ordinal Data, Nonlinear Association, Weighted Rank Correlation
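The idea behind a weighted Kendall coefficient is to make disagreements near the top of a ranking cost more than disagreements near the bottom. The sketch below uses a hyperbolic weight 1/min(rank, rank) per pair; this particular weighting is an illustrative choice, not necessarily the one the paper recommends.

```python
from itertools import combinations

def weighted_kendall(r1, r2, w=lambda i, j: 1.0 / min(i, j)):
    """Weighted Kendall-style coefficient for two rank vectors
    (rank 1 = best).  Each pair is weighted by w applied to its
    ranks in r1, so top-of-list disagreements are penalised more."""
    num = den = 0.0
    for a, b in combinations(range(len(r1)), 2):
        weight = w(r1[a], r1[b])
        sgn1 = (r1[a] - r1[b] > 0) - (r1[a] - r1[b] < 0)
        sgn2 = (r2[a] - r2[b] > 0) - (r2[a] - r2[b] < 0)
        num += weight * sgn1 * sgn2
        den += weight
    return num / den
```

With uniform weights this reduces to the ordinary Kendall tau; with the hyperbolic weights, swapping the two top-ranked items lowers the coefficient more than swapping the two bottom-ranked items, which is exactly the asymmetry the weighted variants are designed to capture.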
Detecting genuine multipartite correlations in terms of the rank of coefficient matrix
We propose a method to detect genuine quantum correlation for an arbitrary
quantum state in terms of the rank of the coefficient matrices associated with
the pure state. We then derive a necessary and sufficient condition for a
quantum state to possess genuine correlation, namely that all corresponding
coefficient matrices have rank larger than one. We demonstrate an approach to
decompose a genuinely quantum-correlated state with a high-rank coefficient
matrix into a form of product states with no genuine quantum correlation, for
pure states.

Comment: 5 pages, 1 figure. Comments are welcome
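The abstract's criterion is easiest to see in the bipartite special case, where the coefficient-matrix rank is the Schmidt rank: a two-qubit pure state |psi> = sum_ij c_ij |i>|j> is a product state exactly when its 2x2 coefficient matrix has rank one. A minimal check (the multipartite condition in the paper requires this for all coefficient matrices, which this sketch does not cover):

```python
import math

def coeff_rank_2x2(c, eps=1e-12):
    """Rank of the 2x2 coefficient matrix of a two-qubit pure state
    |psi> = sum_ij c[i][j] |i>|j>.  Rank 1 <=> product state;
    rank 2 signals genuine (bipartite) quantum correlation."""
    if all(abs(x) <= eps for row in c for x in row):
        return 0
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return 2 if abs(det) > eps else 1

s = 1 / math.sqrt(2)
bell = [[s, 0.0], [0.0, s]]       # (|00> + |11>)/sqrt(2): entangled
plus_zero = [[s, 0.0], [s, 0.0]]  # |+> tensor |0>: product state
```

For a 2x2 matrix the rank test reduces to a determinant check, which is why the Bell state (nonzero determinant) is detected as genuinely correlated while the product state is not.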
PageRank and rank-reversal dependence on the damping factor
PageRank (PR) is an algorithm originally developed by Google to evaluate the
importance of web pages. Considering how deeply rooted Google's PR algorithm is
in gathering relevant information and in the success of modern businesses, the
question of rank-stability and of the choice of the damping factor (a parameter
in the algorithm) is clearly important. We investigate PR as a function of the damping
factor d on a network obtained from a domain of the World Wide Web, finding
that rank-reversal happens frequently over a broad range of PR (and of d). We
use three different correlation measures, Pearson, Spearman, and Kendall, to
study rank-reversal as d changes, and show that the correlation of PR vectors
drops rapidly as d moves away from its frequently cited value.
Rank-reversal is also observed by measuring the Spearman and Kendall rank
correlation, which evaluate relative ranks rather than absolute PR.
Rank-reversal happens not only in directed networks containing rank-sinks but
also in a single strongly connected component, which by definition does not
contain any sinks. We relate rank-reversals to rank-pockets and bottlenecks in
the directed network structure. For the network studied, the relative rank is,
by our measures, more stable around some values of d than at others.

Comment: 14 pages, 9 figures
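The damping-factor dependence is easy to probe on a toy graph. The sketch below runs standard PageRank power iteration for two values of d; the graph and node names are hypothetical, not the Web domain studied in the paper, and the resulting PR vectors for different d would then be compared with Pearson, Spearman, or Kendall correlations as the paper does.

```python
def pagerank(out_links, d, iters=100):
    """Power iteration for PageRank with damping factor d on a
    directed graph {node: [targets]}.  Mass from dangling nodes
    (no out-links) is spread uniformly."""
    nodes = sorted(out_links)
    n = len(nodes)
    pr = dict.fromkeys(nodes, 1.0 / n)
    for _ in range(iters):
        nxt = dict.fromkeys(nodes, (1 - d) / n)  # teleportation term
        for v in nodes:
            if out_links[v]:
                share = d * pr[v] / len(out_links[v])
                for t in out_links[v]:
                    nxt[t] += share
            else:
                for t in nodes:  # dangling node
                    nxt[t] += d * pr[v] / n
        pr = nxt
    return pr

graph = {'a': ['b', 'c'], 'b': ['c'], 'c': ['a'], 'd': ['a']}
pr_hi = pagerank(graph, 0.85)
pr_lo = pagerank(graph, 0.50)
rank_hi = sorted(graph, key=pr_hi.get, reverse=True)
rank_lo = sorted(graph, key=pr_lo.get, reverse=True)
```

Comparing `rank_hi` and `rank_lo` for many values of d is exactly the kind of experiment the paper carries out at Web scale: rank-reversal occurs whenever the two orderings differ.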
Rank Reduction of Correlation Matrices by Majorization
A novel algorithm is developed for the problem of finding a low-rank correlation matrix nearest to a given correlation matrix. The algorithm is based on majorization and is therefore globally convergent. It is computationally efficient, straightforward to implement, and can handle arbitrary weights on the entries of the correlation matrix. A simulation study suggests that majorization compares favourably with competing approaches in terms of the quality of the solution within a fixed computational time. The problem of rank reduction of correlation matrices occurs when pricing a derivative dependent on a large number of assets, where the asset prices are modelled as correlated log-normal processes; such applications mainly concern interest rates.

Keywords: rank, correlation matrix, majorization, lognormal price processes