
76 research outputs found

The cost of large numbers of hypothesis tests on power, effect size and sample size

Author: A Ray
CR Genovese
DJ Dow
DR Nyholt
J Li
JA Todd
JD Storey
JM Bland
JM Cheverud
JP Shaffer
KN Conneely
LC Lazzeroni
M Baker
P Billingsley
PH Westfall
PJ Veazie
R Bender
RA Fisher
RG Miller
S Stigler
SP Selwood
WYS Wang
Y Benjamini
Publication venue: Nature Publishing Group

Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands, which can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level with comparatively small increases in the effect size or sample size. For example, at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are less when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing.
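
The two percentages quoted in this abstract can be roughly reproduced with a short calculation. The sketch below is illustrative only and rests on assumptions the abstract does not state: a two-sided z-test, a Bonferroni-corrected per-test level of alpha divided by the number of tests, and a required sample size proportional to the squared sum of the critical value and the power quantile. The function names are hypothetical and this is not the authors' Excel calculator.

```python
from statistics import NormalDist


def relative_sample_size(n_tests, alpha=0.05, power=0.80):
    """Return a quantity proportional to the required sample size.

    Assumption (not from the abstract): two-sided z-test with a
    Bonferroni-corrected per-test level of alpha / n_tests.
    """
    std_normal = NormalDist()
    # Critical value of the corrected two-sided test. Working in the lower
    # tail avoids rounding error in 1 - alpha / (2 * n_tests) for large n_tests.
    z_alpha = -std_normal.inv_cdf(alpha / (2 * n_tests))
    # Quantile corresponding to the target power.
    z_power = std_normal.inv_cdf(power)
    # For a fixed effect size, required n scales with (z_alpha + z_power)^2.
    return (z_alpha + z_power) ** 2


def percent_increase(tests_from, tests_to, alpha=0.05, power=0.80):
    """Percent increase in sample size needed to keep power constant."""
    ratio = (relative_sample_size(tests_to, alpha, power)
             / relative_sample_size(tests_from, alpha, power))
    return 100.0 * (ratio - 1.0)


if __name__ == "__main__":
    print(f"1 -> 10 tests: {percent_increase(1, 10):.1f}%")
    print(f"1e6 -> 1e7 tests: {percent_increase(10**6, 10**7):.1f}%")
```

Under these assumptions the two printed values come out near 70% and 13%, in line with the figures in the abstract. The ratio depends only on the normal quantiles, so the effect size cancels out of the comparison.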

Crossref

PubMed Central

Validation of pooled genotyping on the Affymetrix 500 k and SNP6.0 genotyping platforms using the polynomial-based probe-specific correction

Author: A Baum
AJ Brookes
CL Simpson
Consortium TWTCC
D Moore
DW Craig
E Meaburn
Fook Tim Chew
G Kirov
H-C Yang
I Zaharieva
J Brohede
JN Hirschhorn
L Butcher
LM Butcher
MI Asher
Ramani Anantharaman
S Macgregor
S Shifman
S Steer
S Wilkening
SJ Docherty
SL Hellard
Team RDC
WJ Cheng Hu
WYS Wang
Y Zuo
Publication venue: BioMed Central
Publication date: 01/01/2009

DOI: 10.1186/1471-2156-10-82 (BMC Genetics)

Crossref

Springer - Publisher Connector

PubMed Central

ScholarBank@NUS

Genome-wide association studies using an adaptive two-stage analysis for a case-control design

Author: D Hwang
Dawn Waterworth
G Zheng
K Song
K Van Steen
Kijoung Song
Qing Lu
RC Elston
RJ Klein
Robert C Elston
WYS Wang
Xiwu Lin
Publication venue: Springer Science and Business Media LLC

Crossref

A First Generation Microsatellite- and SNP-Based Linkage Map of Jatropha

Author: A Rafalski
Chengxin Yi
Chun Ming Wang
CM Wang
CR Carvalho
CX Yi
D Fairless
DB Goldstein
Debashish Bhattacharya
EP Guimarães
ES Lander
F Li
Fei Sun
Felicia Feng
Gen Hua Yue
GH Yue
Grace Lin
H Joachim
J Ye
JCM Dekkers
JH Xia
JR Brown
K Meksem
K Openshaw
Keyu Gu
KY Gu
Lei Li
Loong Chueng Lo
M Hayashi
MW Ganal
P Green
P Panjabi
Peng Liu
PH Dear
QBL Sun
S Jain
S Sato
S Shah
SC Schuster
SF Altschul
Suying Cao
TY Hwang
WYS Wang
Xiaokun Liu
XS Hu
Y Harushima
Y Ren
Y Zhang
Yan Hong
Z Xia
Zhongchao Yin
Publication venue: Public Library of Science
Publication date: 01/01/2011

Jatropha curcas is a potential plant species for biodiesel production. However, its seed yield is too low for profitable production of biodiesel. To improve the productivity, genetic improvement through breeding is essential. A linkage map is an important component in molecular breeding. We established a first-generation linkage map using a mapping panel containing two backcross populations with 93 progeny. We mapped 506 markers (216 microsatellites and 290 SNPs from ESTs) onto 11 linkage groups. The total length of the map was 1440.9 cM with an average marker space of 2.8 cM. Blasting of 222 Jatropha ESTs containing polymorphic SSR or SNP markers against EST-databases revealed that 91.0%, 86.5% and 79.2% of Jatropha ESTs were homologous to counterparts in castor bean, poplar and Arabidopsis respectively. Mapping 192 orthologous markers to the assembled whole genome sequence of Arabidopsis thaliana identified 38 syntenic blocks and revealed that small linkage blocks were well conserved, but often shuffled. The first generation linkage map and the data of comparative mapping could lay a solid foundation for QTL mapping of agronomic traits, marker-assisted breeding and cloning genes responsible for phenotypic variation

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

phenosim - A software to simulate phenotypes for testing in genome-wide association studies

Author: A Platt
B Peng
BE Stranger
BW Lambert
G Ewing
G Hellenthal
G van Rossum
GR Abecasis
HM Kang
Inka Gawenda
Karl J Schmid
L Liang
La Hindorff
M Chadeau-Hyam
M Nordborg
PJ Bradbury
RR Hudson
S Atwell
S Besenbacher
S Kim
S Neuenschwander
S Purcell
T Mailund
Torsten Günther
WYS Wang
Y Li
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: There is great interest in understanding the genetic architecture of complex traits in natural populations. Genome-wide association studies (GWAS) are becoming routine in human, animal and plant genetics to understand the connection between naturally occurring genotypic and phenotypic variation. Coalescent simulations are commonly used in population genetics to simulate genotypes under different parameters and demographic models. Results: Here, we present phenosim, a software tool that adds a phenotype to genotypes generated in time-efficient coalescent simulations. Both qualitative and quantitative phenotypes can be generated, and it is possible to partition phenotypic variation between additive effects and epistatic interactions between causal variants. The output formats of phenosim are directly usable as input for different GWAS tools. The applicability of phenosim is shown by simulating a genome-wide association study in Arabidopsis thaliana. Conclusions: By using the coalescent approach to generate genotypes and phenosim to add phenotypes, the data sets can be used to assess the influence of various factors such as demography, genetic architecture or selection on the statistical power of association methods to detect causal genetic variants under a wide variety of population genetic scenarios. phenosim is freely available from the authors' website: http://evoplant.uni-hohenheim.de

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Gene-Centric Characteristics of Genome-Wide Association Studies

Author: A Gonzalez-Neira
A Siepel
A Woolfe
A Yoshida
AK Bielinsky
Changzheng Dong
CJ Willer
CS Carlson
D Fredman
DC King
DL Nicolae
E Jorgenson
E Tantoso
EA Grice
FS Collins
H Matsuzaki
I Pe'er
J Akey
J Hirschhorn
J Jaruzelska
JC Barrett
JR Hughes
JT Shin
K Auro
Katrina Gwinn
KD Dahlquist
KG Becker
KL Gunderson
L Giot
M Ashburner
M De Gobbi
M Kanehisa
N Risch
PD Stenson
Peilin Jia
R Saxena
R Sladek
RL Proia
S Wiltshire
TG Lesnick
U Stelzl
V Steinthorsdottir
W Huang
Wei Huang
WYS Wang
Y Altuvia
Ying Wang
Yixue Li
Ziliang Qian
Publication venue: Public Library of Science
Publication date: 05/12/2007

BACKGROUND: High-throughput genotyping chips have contributed greatly to genome-wide association (GWA) studies that identify novel disease susceptibility single nucleotide polymorphisms (SNPs). The high-density chips are designed using two different SNP selection approaches: the direct gene-centric approach, and the indirect quasi-random SNP or linkage disequilibrium (LD)-based tagSNP approaches. Although all these approaches can provide high genome coverage and ascertain variants in genes, it is not clear to what extent they capture the common genic variants. It is also important to characterize and compare the differences between these approaches. METHODOLOGY/PRINCIPAL FINDINGS: In our study, using both the Phase II HapMap data and the disease variants extracted from OMIM, a gene-centric evaluation was first performed to assess the ability of the approaches to capture the disease variants in the Caucasian population. The distribution patterns of SNPs were then characterized in genic regions, evolutionarily conserved introns and nongenic regions, ontologies and pathways. The results show that, no matter which SNP selection approach is used, the current high-density SNP chips provide very high coverage in genic regions and can capture most of the known common disease variants under the HapMap frame. The results also show that the differences between the direct and the indirect approaches are relatively small. Both have similar SNP distribution patterns in these gene-centric characteristics. CONCLUSIONS/SIGNIFICANCE: This study suggests that the indirect approaches not only have the advantage of high coverage but also are useful for studies focusing on various functional SNPs either in genes or in the conserved regions that the direct approach supports. The study and the annotation of characteristics will be helpful for designing and analyzing GWA studies that aim to identify genetic risk factors involved in common diseases, especially variants in genes and conserved regions.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Disease-associated alleles in genome-wide association studies are enriched for derived low frequency alleles relative to HapMap and neutral expectations

Background: Genome-wide association studies give insight into the genetic basis of common diseases. An open question is whether the allele frequency distributions and ancestral vs. derived states of disease-associated alleles differ from the rest of the genome. Characteristics of disease-associated alleles can be used to increase the yield of future studies. Methods: The set of all common disease-associated alleles found in genome-wide association studies prior to January 2010 was analyzed and compared with HapMap and theoretical null expectations. In addition, allele frequency distributions of different disease classes were assessed. Ages of HapMap and disease-associated alleles were also estimated. Results: The allele frequency distribution of HapMap alleles was qualitatively similar to neutral expectations. However, disease-associated alleles were more likely to be low frequency derived alleles relative to null expectations. 43.7% of disease-associated alleles were ancestral alleles. The mean frequency of disease-associated alleles was lower than that of randomly chosen CEU HapMap alleles (0.394 vs. 0.610, after accounting for probability of detection). Similar patterns were observed for the subset of disease-associated alleles that have been verified in multiple studies. SNPs implicated in genome-wide association studies were enriched for young SNPs compared to randomly selected HapMap loci. Odds ratios of disease-associated alleles tended to be less than 1.5 and varied by frequency, confirming previous studies. Conclusions: Alleles associated with genetic disease differ from randomly selected HapMap alleles and neutral expectations. The evolutionary history of alleles (frequency and ancestral vs. derived state) influences whether they are implicated in genome-wide association studies.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Genome-Wide Analysis Reveals a Complex Pattern of Genomic Imprinting in Mice

Author: A Burt
A Lewis
AJ Wood
Anne C. Ferguson-Smith
AR Isles
BE Hayward
BT Heijmans
C Dong
C Mantey
Charles Roseman
CK Chai
DJ De Koning
ES Lander
H Goodale
IM Morison
J Li
J Macarthur
James M. Cheverud
Jason B. Wolf
JM Cheverud
JM Itier
JN Hirschhorn
KS Kim
KW Broman
L Chen
L Qian
LL Li
LS Wilkinson
M Constancia
M Georges
MG Kramer
MS Bartolomei
NE Cockett
PP Luedi
R Hager
RD Nicholls
Reinmar Hager
T Hrbek
T Sakatani
TT Vaughn
W Reik
WYS Wang
Publication venue: Public Library of Science
Publication date: 01/01/2008

Parent-of-origin–dependent gene expression resulting from genomic imprinting plays an important role in modulating complex traits ranging from developmental processes to cognitive abilities and associated disorders. However, while gene-targeting techniques have allowed for the identification of imprinted loci, very little is known about the contribution of imprinting to quantitative variation in complex traits. Most studies, furthermore, assume a simple pattern of imprinting, resulting in either paternal or maternal gene expression; yet, more complex patterns of effects also exist. As a result, the distribution and number of different imprinting patterns across the genome remain largely unexplored. We address these unresolved issues using a genome-wide scan for imprinted quantitative trait loci (iQTL) affecting body weight and growth in mice using a novel three-generation design. We identified ten iQTL that display much more complex and diverse effect patterns than previously assumed, including four loci with effects similar to the callipyge mutation found in sheep. Three loci display a new phenotypic pattern that we refer to as bipolar dominance, where the two heterozygotes are different from each other while the two homozygotes are identical to each other. Our study furthermore detected a paternally expressed iQTL on Chromosome 7 in a region containing a known imprinting cluster with many paternally expressed genes. Surprisingly, the effects of the iQTL were mostly restricted to traits expressed after weaning. Our results imply that the quantitative effects of an imprinted allele at a locus depend both on its parent of origin and the allele it is paired with. Our findings also show that the imprinting pattern of a locus can be variable over ontogenetic time and, in contrast to current views, may often be stronger at later stages in life

Public Library of Science (PLOS)

OPUS

Crossref

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

The University of Manchester - Institutional Repository

The Complexity of Vascular and Non-Vascular Complications of Diabetes: The Hong Kong Diabetes Registry

Author: A Luk
AO Luk
AP Kong
AY Cheng
AY Wu
BM Brenner
C Zhang
CK Wong
EE Calle
GT Ko
H Ogawa
HC Gerstein
HH Parving
J Belch
JA Johnson
JC Chan
JCN Chan
Juliana C. N. Chan
K Piwernetz
KH Yoon
L Baum
M Woodward
MKW Lo
NJ Morrish
O Warburg
P Gaede
PC Tong
PCY Tong
Peter C. Y. Tong
RC Ma
Rebecca Wong
Ronald C. W. Ma
S Hernandez-Diaz
ST Tu
W Yang
Wingyee So
WY Leung
WY So
WYS Leung
X Yang
Xilin Yang
XL Yang
XY Song
Y Wang
Publication venue: Current Science Inc.
Publication date: 01/01/2011

Diabetes is a complex disease characterized by chronic hyperglycemia and multiple phenotypes. In 1995, we used a doctor-nurse-clerk team and structured protocol to establish the Hong Kong Diabetes Registry in a quality improvement program. By 2009, we had accrued 2616 clinical events in 9588 Chinese type 2 diabetic patients with a follow-up duration of 6 years. The detailed phenotypes at enrollment and follow-up medications have allowed us to develop a series of risk equations to predict multiple endpoints with high sensitivity and specificity. In this prospective database, we were able to validate findings from clinical trials in real practice, confirm close links between cardiovascular and renal disease, and demonstrate the emerging importance of cancer as a leading cause of death. In addition to serving as a tool for risk stratification and quality assurance, ongoing data analysis of the registry also reveals secular changes in disease patterns and identifies unmet needs

Crossref

Springer - Publisher Connector

PubMed Central

TNF-α is involved in activating DNA fragmentation in skeletal muscle

Author: B Huppertz
C Drott
C García-Martínez
C Sidoti-de-Fraisse
CI Wang
DL Allen
DS Tews
DW Banner
DW Nixon
F Dworzak
G Itoh
H Ohta
HP Hohman
J Kajstura
J Ogasawara
J Rothe
JM Argilés
KA Krown
KB Harvey
L Balducci
L Dalla Libera
LM Obeid
LM Schwartz
M Llovera
M Sandri
M Tanaka
M Van Royen
MM Bradford
P Costelli
P Vandenabeele
P Workman
R Adams
RD Evans
S Dimmeler
S Warren
T Weiss
TR Downs
V Haridas
W Declercq
WD De Wys
ZG Liu
Publication venue: Nature Publishing Group

Intraperitoneal administration of 100 μg kg−1 (body weight) of tumour necrosis factor-α to rats for 8 consecutive days resulted in a significant decrease in protein content, which was concomitant with a reduction in DNA content. Interestingly, the protein/DNA ratio was unchanged in the skeletal muscle of the tumour necrosis factor-α-treated animals as compared with the non-treated controls. Analysis of muscle DNA fragmentation clearly showed enhanced laddering in the skeletal muscle of tumour necrosis factor-α-treated animals, suggesting an apoptotic phenomenon. In a different set of experiments, mice bearing a cachexia-inducing tumour (the Lewis lung carcinoma) showed an increase in muscle DNA fragmentation (9.8-fold) as compared with their non-tumour-bearing control counterparts as previously described. When gene-deficient mice for tumour necrosis factor-α receptor protein I were inoculated with Lewis lung carcinoma, they were also affected by DNA fragmentation; however the increase was only 2.1-fold. These results suggest that tumour necrosis factor-α partly mediates DNA fragmentation during experimental cancer-associated cachexia

Crossref

PubMed Central