10 research outputs found

Adapting Data Adaptive Methods for Small, but High Dimensional Omic Data: Applications to GWAS/EWAS and More

Author: Hubbard Alan E
Kherad Pajouh Sara
Smith Martyn T.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/10/2013

Exploratory analysis of high dimensional omics data has received much attention since the explosion of high-throughput technology allows simultaneous screening of tens of thousands of characteristics (genomics, metabolomics, proteomics, adducts, etc.). Part of this trend has been an increase in the dimension of exposure data in studies of environmental exposure and associated biomarkers. Though some of the general approaches, such as GWAS, are transferable, what has received less focus is 1) how to derive estimates of independent associations in the context of many competing causes, without resorting to a misspecified model, and 2) how to derive accurate small-sample inference when data adaptive techniques are used in this context. This paper focuses on semi-parametric variable importance analysis of high dimensional data sets of modest sample size (e.g., gene expression, mRNA, etc.). Though the methodology we propose is generally applicable to similar situations, we present the method in the context of a study of miRNA expression for an environmental exposure. Specifically, the analysis faces not just a large number of comparisons, but also the need to tease out the association of miRNA expression with an exposure apart from confounders such as age, race, smoking conditions, BMI, etc. Our goal is to propose a method that is reasonably robust in small samples, but does not rely on misspecified (arbitrary) parametric assumptions, and thus will be based on data-adaptive methods. The proposed methodology is, we believe, a powerful combination of existing semi-parametric statistical methods and theory, as well as a simple framework for the use of common empirical Bayes approaches to aid in small-sample inference.
Specifically, we propose using targeted maximum likelihood estimation (TMLE) for estimating variable importance measures, along with a general adaptation of the commonly used Limma approach, which relies on specification of the so-called influence curve of the proposed estimator. The result is a machine-based approach that can estimate independent associations in high dimensional data, but protects against the unreliability of small-sample inference that can result when using data adaptive estimation in relatively small samples.

Collection Of Biostatistics Research Archive

Statistical Inference for Data Adaptive Target Parameters

Author: Hubbard Alan E
Pajouh Sara Kherad
van der Laan Mark J.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 19/06/2013

Suppose one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target, we partition the sample into V equal-size sub-samples, and use this partitioning to define V splits into an estimation sample (one of the V subsamples) and a corresponding complementary parameter-generating sample that is used to generate a target parameter. For each of the V parameter-generating samples, we apply an algorithm that maps the sample into a target parameter mapping, which represents the statistical target parameter generated by that parameter-generating sample. We define our sample-split data-adaptive statistical target parameter as the average of these V sample-specific target parameters. We present an analogue estimator of this type of data adaptive target parameter and corresponding statistical inference. This general methodology for generating data adaptive target parameters while still providing valid statistical inference is demonstrated with a number of examples. These examples demonstrate that this methodology presents new opportunities for statistical learning from data that go beyond the usual requirement that the estimand is a priori defined in order to allow for proper statistical inference. This new framework provides a rigorous statistical methodology for both exploratory and confirmatory analysis within the same data. Given that more research is becoming "data-driven", the theory developed within this paper provides a new impetus for a greater involvement of statistical inference in problems that are being increasingly addressed by clever, yet ad hoc pattern finding methods; that is, the role of statisticians is being supplanted by computer scientists deriving clever, yet typically ad hoc methods that "discover" the interesting patterns in data.
The methodology presented in this paper can harness these methods, and now provide rigorous inference for the patterns, or target parameters, suggested by such procedures. In this way, it returns exercises involving learning from data back within the proper domain of rigorous statistical inference. To suggest such potential, and to verify the predictions of the theory, simulation studies based upon algorithms that map the parameter-generating sample into the desired estimand are shown. However, the methodology generalizes to situations where even these algorithms are not prespecified.
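
The sample-splitting scheme described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the paper: the data layout (rows of binary covariates with a numeric outcome), the hypothetical selection rule `select_covariate`, and the use of a simple difference in means as the fold-specific target parameter. The sketch returns only the point estimate (the average over the V folds); the inference step based on influence curves is not shown.

```python
import random
from statistics import mean


def simulate_rows(n=200, p=5, signal_index=2, seed=1):
    """Toy data (illustrative only): p binary covariates, one of which shifts the outcome by 2."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.randint(0, 1) for _ in range(p)]
        y = 2.0 * x[signal_index] + rng.gauss(0.0, 0.5)
        rows.append((x, y))
    return rows


def mean_difference(rows, j):
    """Difference in mean outcome between rows with x[j] == 1 and rows with x[j] == 0."""
    treated = [y for x, y in rows if x[j] == 1]
    control = [y for x, y in rows if x[j] == 0]
    if not treated or not control:
        return 0.0  # no contrast available in this sample
    return mean(treated) - mean(control)


def select_covariate(rows):
    """Hypothetical data-adaptive rule: pick the covariate with the largest absolute mean difference."""
    p = len(rows[0][0])
    return max(range(p), key=lambda j: abs(mean_difference(rows, j)))


def sample_split_estimate(rows, V=5, seed=0):
    """Average of V fold-specific estimates; each target is chosen on the complementary sample."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    folds = [idx[v::V] for v in range(V)]  # V roughly equal-size sub-samples
    fold_results = []
    for v in range(V):
        held_out = set(folds[v])
        estimation = [rows[i] for i in folds[v]]
        generating = [rows[i] for i in idx if i not in held_out]
        j = select_covariate(generating)  # target defined without the estimation sample
        fold_results.append((j, mean_difference(estimation, j)))
    psi = mean(estimate for _, estimate in fold_results)
    return psi, fold_results
```

Calling `sample_split_estimate(simulate_rows())` returns the averaged estimate together with the covariate index and estimate from each fold. The point of the design is that each fold's target is chosen without looking at the data later used to estimate it.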

Collection Of Biostatistics Research Archive

Permutation tests for experimental designs, with extension to simultaneous EEG signal analysis

Author: Kherad Pajouh Sara
Publication venue: Université de Genève
Publication date: 01/01/2011

Permutation tests (or randomization tests) form a class of nonparametric tests and are therefore appropriate for testing hypotheses when parametric assumptions are not met. In this thesis, we propose new permutation tests that can be used to analyze complex experimental designs as well as experimental protocols with EEG (electroencephalogram) data. More precisely, in the first part of the thesis (Chapter 2), we developed an exact permutation test for fixed-effect or factorial ANOVA, that is, for ANOVA with a single error term. To achieve this, we first compute the "reduced-model residuals", which remove all fixed effects other than the one being tested. In the second part of the thesis (Chapter 3), we focused on more complex designs and adapted this permutation test to mixed models and repeated measures ANOVA. We introduce an approximate permutation test based on reduced-model residuals for repeated measures and mixed models. In the third part of the thesis (Chapter 4), we extended the proposed permutation tests to data with more than one dimension, such as EEG (electroencephalogram) or ERP (event-related potential) signals, which we refer to as signals. In the vast majority of experiments in psychology or neuroscience that use these techniques, the experimental design is complex, with several between- and/or within-subject factors. We adapted the permutation of residuals under the reduced model to signals in these types of design, that is, for factorial ANOVA, repeated measures ANOVA, and mixed models.
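
The idea of permuting reduced-model residuals ("résidus du modèle réduit") can be sketched as follows. This is a simplified, approximate variant written for illustration; it is not the exact test developed in the thesis, whose details the abstract does not give. The two-factor layout, the function names, and the choice of the between-group sum of squares as test statistic are all assumptions of this sketch.

```python
import random
from statistics import mean


def reduced_model_residuals(y, nuisance):
    """Residuals after removing the level means of the nuisance factor (the reduced model)."""
    level_means = {
        level: mean(v for v, g in zip(y, nuisance) if g == level)
        for level in set(nuisance)
    }
    return [v - level_means[g] for v, g in zip(y, nuisance)]


def between_group_ss(values, groups):
    """Between-group sum of squares of `values` across the levels of `groups`."""
    grand = mean(values)
    total = 0.0
    for level in set(groups):
        members = [v for v, g in zip(values, groups) if g == level]
        total += len(members) * (mean(members) - grand) ** 2
    return total


def residual_permutation_test(y, tested, nuisance, n_perm=999, seed=0):
    """Approximate p-value for the tested factor, permuting reduced-model residuals."""
    rng = random.Random(seed)
    resid = reduced_model_residuals(y, nuisance)
    observed = between_group_ss(resid, tested)
    shuffled = resid[:]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign residuals to units at random
        if between_group_ss(shuffled, tested) >= observed:
            exceed += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (exceed + 1) / (n_perm + 1)
```

Here `y` is the list of responses, while `tested` and `nuisance` hold each unit's level of the factor being tested and of the factor being removed. A small p-value indicates that the observed between-group variation in the residuals is unusually large compared with random reassignments.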

Archive ouverte UNIGE

An exact permutation method for testing any effect in balanced and unbalanced fixed effect ANOVA

Author: Kherad-Pajouh Sara
Renaud Olivier

The ANOVA method and permutation tests, two heritages of Fisher, have been extensively studied. Several permutation strategies have been proposed by others to obtain a distribution-free test for factors in a fixed effect ANOVA (i.e., single error term ANOVA). The resulting tests are either approximate or exact. However, there exists no universal exact permutation test which can be applied to an arbitrary design to test a desired factor. An exact permutation strategy applicable to fixed effect analysis of variance is presented. The proposed method can be used to test any factor, even in the presence of higher-order interactions. In addition, the method has the advantage of being applicable in unbalanced designs (all-cell-filled), which is a very common situation in practice, and it is the first method with this capability. Simulation studies show that the proposed method has an actual level which stays remarkably close to the nominal level, and its power is always competitive. This is the case even with very small datasets, strongly unbalanced designs and non-Gaussian errors. No other competitor shows such an enviable behavior.
Keywords: ANOVA; Experimental design; Non-parametric methods; Permutation test

Research Papers in Economics

An exact permutation method for testing any effect in balanced and unbalanced fixed effect ANOVA

Author: Kherad Pajouh Sara
Renaud Olivier
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010

The ANOVA method and permutation tests, two heritages of Fisher, have been extensively studied. Several permutation strategies have been proposed by others to obtain a distribution-free test for factors in a fixed effect ANOVA (i.e., single error term ANOVA). The resulting tests are either approximate or exact. However, there exists no universal exact permutation test which can be applied to an arbitrary design to test a desired factor. An exact permutation strategy applicable to fixed effect analysis of variance is presented. The proposed method can be used to test any factor, even in the presence of higher-order interactions. In addition, the method has the advantage of being applicable in unbalanced designs (all-cell-filled), which is a very common situation in practice, and it is the first method with this capability. Simulation studies show that the proposed method has an actual level which stays remarkably close to the nominal level, and its power is always competitive. This is the case even with very small datasets, strongly unbalanced designs and non-Gaussian errors. No other competitor shows such an enviable behavior.

Archive ouverte UNIGE

Statistical Inference for Data Adaptive Target Parameters.

Author: Hubbard Alan E
Kherad-Pajouh Sara
van der Laan Mark J
Publication venue: eScholarship, University of California
Publication date: 01/05/2016

Suppose one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target, we partition the sample into V equal-size sub-samples, and use this partitioning to define V splits into an estimation sample (one of the V subsamples) and a corresponding complementary parameter-generating sample. For each of the V parameter-generating samples, we apply an algorithm that maps the sample to a statistical target parameter. We define our sample-split data adaptive statistical target parameter as the average of these V sample-specific target parameters. We present an estimator (and corresponding central limit theorem) of this type of data adaptive target parameter. This general methodology for generating data adaptive target parameters is demonstrated with a number of practical examples that highlight new opportunities for statistical learning from data. This new framework provides a rigorous statistical methodology for both exploratory and confirmatory analysis within the same data. Given that more research is becoming "data-driven", the theory developed within this paper provides a new impetus for a greater involvement of statistical inference in problems that are being increasingly addressed by clever, yet ad hoc pattern finding methods. To suggest such potential, and to verify the predictions of the theory, extensive simulation studies, along with a data analysis based on adaptively determined intervention rules, are shown and give insight into how to structure such an approach. The results show that the data adaptive target parameter approach provides a general framework and resulting methodology for data-driven science.

eScholarship - University of California

An exact permutation method for testing any effect in balanced and unbalanced fixed effect ANOVA

Author: Anderson
Basso
Brombin
Cardinal
David
Davison
Edgington
Fisher
Freedman
Gonzalez
Good
Good
Huh
Jung
Kirk
Manly
Olivier Renaud
Pesarin
Salmaso
Sara Kherad-Pajouh
Searle
Still
Ter Braak
Welch
Publication venue: 'Elsevier BV'

Crossref

A general permutation approach for analyzing repeated measures ANOVA and mixed-model designs

Author: A Still
A White
BC Jung
C Hirotsu
CA Field
D Basso
D Basso
D Basso
D Freedman
F Pesarin
F Pesarin
G Keppel
H Rouanet
H Sahai
J Cornfield
J Myers
M Anderson
Olivier Renaud
P Good
R Cardinal
R Fisher
R Kirk
Sara Kherad-Pajouh
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The ANOVA method and permutation tests, two heritages of Fisher, have been extensively studied. Several permutation strategies have been proposed by others to obtain a distribution-free test for factors in a fixed effect ANOVA (i.e., single error term ANOVA). The resulting tests are either approximate or exact. However, there exists no universal exact permutation test which can be applied to an arbitrary design to test a desired factor. An exact permutation strategy applicable to fixed effect analysis of variance is presented. The proposed method can be used to test any factor, even in the presence of higher-order interactions. In addition, the method has the advantage of being applicable in unbalanced designs (all-cell-filled), which is a very common situation in practice, and it is the first method with this capability. Simulation studies show that the proposed method has an actual level which stays remarkably close to the nominal level, and its power is always competitive. This is the case even with very small datasets, strongly unbalanced designs and non-Gaussian errors. No other competitor shows such an enviable behavior.

Crossref

Archive ouverte UNIGE