A Rademacher Complexity Based Method for Controlling Power and Confidence Level in Adaptive Statistical Analysis
While standard statistical inference techniques and machine learning
generalization bounds assume that tests are run on data selected independently
of the hypotheses, practical data analysis and machine learning are usually
iterative and adaptive processes where the same holdout data is often used for
testing a sequence of hypotheses (or models), which may each depend on the
outcome of the previous tests on the same data. In this work, we present
RadaBound, a rigorous, efficient and practical procedure for controlling the
generalization error when using a holdout sample for multiple adaptive testing.
Our solution is based on a new application of the Rademacher Complexity
generalization bounds, adapted to dependent tests. We demonstrate the
statistical power and practicality of our method through extensive simulations
and comparisons to alternative approaches.
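
For readers unfamiliar with the quantity the bounds are built on, the following is a minimal sketch of a Monte Carlo estimate of the empirical Rademacher complexity of a finite set of functions evaluated on a holdout sample. It is not the RadaBound procedure, whose details the abstract does not give. It only illustrates the standard definition, the expectation over random signs of the largest sign-weighted average across the function set (without an absolute value). The names `empirical_rademacher`, `values`, `n_draws`, and `seed` are hypothetical and chosen for this illustration.

```python
import numpy as np


def empirical_rademacher(values, n_draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity.

    values: array of shape (k, n), where values[j, i] is the value of the
            j-th function on the i-th holdout point (hypothetical layout).
    Returns an estimate of E_sigma[ max_j (1/n) * sum_i sigma_i * values[j, i] ].
    """
    values = np.asarray(values, dtype=float)
    k, n = values.shape
    rng = np.random.default_rng(seed)
    # Each row of sigma is one draw of n independent, uniform +/-1 signs.
    sigma = rng.choice([-1.0, 1.0], size=(n_draws, n))
    # corr[d, j] is the sign-weighted average of function j under draw d.
    corr = sigma @ values.T / n
    # Take the maximum over functions, then average over sign draws.
    return float(corr.max(axis=1).mean())


# Usage: five hypothetical functions evaluated on 50 holdout points.
demo_rng = np.random.default_rng(1)
demo_values = demo_rng.uniform(-1.0, 1.0, size=(5, 50))
estimate = empirical_rademacher(demo_values)
```

Because the estimate takes a maximum over the function set, it cannot decrease as functions are added (for a fixed set of sign draws), which is one way to see why testing more hypotheses on the same holdout sample calls for a wider generalization bound.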