4 research outputs found

Measuring association with recursive rank binning

Author: Oldford Wayne
Salahub Chris
Publication date: 14/11/2023

Pairwise measures of dependence are a common tool to map data in the early stages of analysis, with several modern examples based on maximized partitions of the pairwise sample space. Following a short survey of modern measures of dependence, we introduce a new measure which recursively splits the ranks of a pair of variables to partition the sample space and computes the $\chi^2$ statistic on the resulting bins. Splitting logic is detailed for splits maximizing a score function and randomly selected splits. Simulations indicate that random splitting produces a statistic conservatively approximated by the $\chi^2$ distribution without a loss of power to detect numerous different data patterns compared to maximized binning. Though it seems to add no power to detect dependence, maximized recursive binning is shown to produce a natural visualization of the data and the measure. Applying maximized recursive rank binning to S&P 500 constituent data suggests the automatic detection of tail dependence.

Comment: 59 pages, 22 figures
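
The random-splitting variant described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_rank_binning`, the stopping rules (`max_depth`, `min_expected`), and the uniform choice of split axis and cut point are all assumptions made for illustration. It ranks both variables, recursively cuts the rank square at random positions, and sums (observed - expected)^2 / expected over the final bins, where the expected count of a bin under independence is its area in rank units divided by n.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks; ties are broken by order of appearance."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for r, i in enumerate(order, start=1):
        out[i] = r
    return out


def random_rank_binning(x, y, max_depth=4, min_expected=5.0, seed=0):
    """Chi-squared statistic on bins from random recursive rank splits.

    Hypothetical stopping rules (assumptions, with min_expected > 0):
    a bin is not split past max_depth, or if the randomly chosen axis
    cannot be cut while keeping each child's expected count at or
    above min_expected.
    """
    rng = random.Random(seed)
    n = len(x)
    points = list(zip(ranks(x), ranks(y)))
    # Each bin covers the rank interval (lo, hi] on both axes.
    stack = [((0, n), (0, n), points, 0)]
    leaves = []  # (observed count, expected count) per final bin
    while stack:
        xr, yr, pts, depth = stack.pop()
        axis = rng.randrange(2)  # 0: cut the x ranks, 1: cut the y ranks
        split, other = (xr, yr) if axis == 0 else (yr, xr)
        # Narrowest child width that keeps its expected count >= min_expected.
        width = math.ceil(min_expected * n / (other[1] - other[0]))
        lo, hi = split[0] + width, split[1] - width
        if depth >= max_depth or lo > hi:
            expected = (xr[1] - xr[0]) * (yr[1] - yr[0]) / n
            leaves.append((len(pts), expected))
            continue
        cut = rng.randint(lo, hi)  # random cut position, inclusive bounds
        for a, b in ((split[0], cut), (cut, split[1])):
            sub = [p for p in pts if a < p[axis] <= b]
            child = ((a, b), other) if axis == 0 else (other, (a, b))
            stack.append((child[0], child[1], sub, depth + 1))
    stat = sum((o - e) ** 2 / e for o, e in leaves)
    return stat, leaves


if __name__ == "__main__":
    gen = random.Random(1)
    x = [gen.random() for _ in range(200)]
    y = [v + 0.1 * gen.random() for v in x]  # strongly dependent pair
    stat, leaves = random_rank_binning(x, y)
    print(stat, len(leaves))
```

Because the cuts are placed on ranks, the statistic is unchanged by monotone transformations of either variable. How the statistic is referred to a $\chi^2$ distribution (for example, the choice of degrees of freedom) is left out here, since the listing does not state it.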

arXiv.org e-Print Archive

Explorations in Pairwise Measures of Dependence and Pooled Significance

Author: Salahub Chris
Publication venue: 'University of Waterloo'
Publication date: 16/01/2024

In the exploration of data sets with many variables, the search for interesting pairs is often the first step of analysis. This search builds a road map of the entirety of data before looking at its details, and can provide indispensable inspiration for deeper investigation. Challenges are present, however, in adjusting results to address the multiple testing problem and choosing a measure with sufficient generality to detect many forms of dependence. This work proposes the measurement of statistical dependence by recursive binning of marginal ranks as a flexible measure of dependence. Simulation studies are used to characterize the null distribution and demonstrate the method’s sensitivity to different data patterns. By splitting bins randomly, the χ² statistic has a null distribution conservatively approximated by the χ² distribution, seemingly without a loss of power compared to maximized splitting rules, which inflate the value of the statistic. The method is demonstrated on real S&P 500 constituent data. To adjust for multiple testing, a new framework and coefficient are devised with appropriate proofs for analyzing pooled p-values based on their tendency to detect concentrated or diffuse evidence. This motivates a pooled p-value based on the χ² quantile function as a way to adjust for multiple testing while controlling the family-wise error rate and fine-tuning for the evidence pattern of interest. Simulation studies suggest this method is similarly powerful to the uniformly most powerful method while being more robust to mis-specification. Both the recursive binning measurement of association and the χ² pooled p-value are then demonstrated for genetic data after a tutorial introducing the relevant genetic concepts. A method of moments adjustment of the χ² pooled p-value to account for correlation between tests is introduced and used with genomic and phenomic data from mice to identify regions of interest.
The use of pooled p-values to combine parameter estimates in meta-analysis is also explored, establishing the concepts of evidential intervals and demonstrating their behaviour on simulated data.
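
One construction consistent with the "pooled p-value based on the χ² quantile function" mentioned in this abstract can be written out as follows. The notation is an assumption made here, not taken from the listing: M independent p-values p_1, ..., p_M, a degrees-of-freedom parameter κ > 0, and F for the χ² distribution function with the subscripted degrees of freedom.

```latex
\[
  p_{\mathrm{pool}}(p_1, \dots, p_M; \kappa)
  = 1 - F_{\chi^2_{M\kappa}}\!\left( \sum_{i=1}^{M} F^{-1}_{\chi^2_{\kappa}}(1 - p_i) \right)
\]
```

If each p_i is independently uniform on (0, 1) under the null hypothesis, each transformed term follows a χ² distribution with κ degrees of freedom, so the sum follows a χ² distribution with Mκ degrees of freedom and the pooled value is again uniform. Taking κ = 2 gives F^{-1}(1 - p_i) = -2 ln p_i, which is Fisher's method.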

University of Waterloo's Institutional Repository

Representações euclidianas de dados: uma abordagem para variáveis heterogéneas (Euclidean representations of data: an approach for heterogeneous variables)

Author: Dória Isabel Maria Tudela Reimão Pinto de França, 1952-
Publication date: 01/01/2008

Doctoral thesis, Medicine (Biomathematics), Universidade de Lisboa, Faculdade de Medicina, 2009. Available in the document

Universidade de Lisboa: Repositório.UL

A measure of association for complex data

Author: Huh Moon Yul
Lee Seung-Chun

Research Papers in Economics