157 research outputs found

Power and type I error rate of false discovery rate approaches in genome-wide association studies

Author: Chazaro Irmarie
Cui Jing
Cupples L Adrienne
Demissie Serkalem
Yang Qiong
Publication venue: BioMed Central
Publication date: 01/01/2005

In genome-wide genetic studies with a large number of markers, balancing the type I error rate and power is a challenging issue. Recently proposed false discovery rate (FDR) approaches are promising solutions to this problem. Using the 100 simulated datasets of a genome-wide marker map spaced about 3 cM and phenotypes from the Genetic Analysis Workshop 14, we studied the type I error rate and power of Storey's FDR approach, and compared it to the traditional Bonferroni procedure. We confirmed that Storey's FDR approach had a strong control of FDR. We found that Storey's FDR approach only provided weak control of family-wise error rate (FWER). For these simulated datasets, Storey's FDR approach only had slightly higher power than the Bonferroni procedure. In conclusion, Storey's FDR approach is more powerful than the Bonferroni procedure if strong control of FDR or weak control of FWER is desired. Storey's FDR approach has little power advantage over the Bonferroni procedure if there is low linkage disequilibrium among the markers. Further evaluation of the type I error rate and power of the FDR approaches for higher linkage disequilibrium and for haplotype analyses is warranted
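As a rough illustration of the comparison described in this abstract, the sketch below contrasts the Bonferroni procedure with a q-value calculation in the style of Storey's FDR approach. It is a minimal sketch, not the authors' analysis: the function names, the fixed tuning parameter `lam = 0.5` used to estimate the proportion of true null hypotheses, and the example p-values are all assumptions made for illustration.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m (FWER control)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def storey_pi0(pvalues, lam=0.5):
    """Estimate the proportion of true nulls from the p-values above lam.

    lam = 0.5 is an assumed, fixed tuning value; the estimate is capped at 1.
    If no p-value exceeds lam the estimate is 0, which is a degenerate case.
    """
    m = len(pvalues)
    return min(1.0, sum(p > lam for p in pvalues) / ((1.0 - lam) * m))


def storey_qvalues(pvalues, lam=0.5):
    """Compute q-values: q(p_(i)) = min over j >= i of pi0 * m * p_(j) / j."""
    m = len(pvalues)
    pi0 = storey_pi0(pvalues, lam)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is a running minimum.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[idx] / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical p-values for ten markers (illustrative only).
example_p = [0.0001, 0.004, 0.015, 0.03, 0.2, 0.5, 0.7, 0.9, 0.95, 0.99]
n_bonferroni = sum(bonferroni_reject(example_p, alpha=0.05))
n_fdr = sum(q <= 0.05 for q in storey_qvalues(example_p))
print("Bonferroni rejections:", n_bonferroni)
print("q-value <= 0.05 rejections:", n_fdr)
```

With so few tests the two procedures can differ by only a marker or two, which is in line with the abstract's remark that the power advantage is small when markers are in low linkage disequilibrium.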

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Description of the Framingham Heart Study data for Genetic Analysis Workshop 13

Author: Copenhafer Donna
Cupples L Adrienne
Demissie Serkalem
Levy Daniel
Yang Qiong
Publication venue: BioMed Central
Publication date: 01/01/2003

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

PubMed Central

Disparities in allele frequencies and population differentiation for 101 disease-associated single nucleotide polymorphisms between Puerto Ricans and Non-Hispanic Whites

Author: Adiconis Xian
Arnett Donna
Demissie Serkalem
Garcia-Bailo Bibiana
Lai Chao-Qiang
Mattei Josiemer
Ordovas Jose M.
Parnell Laurence D.
Shen Jian
Tucker Katherine L.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/08/2009

BACKGROUND. Variations in gene allele frequencies can contribute to differences in the prevalence of some common complex diseases among populations. Natural selection modulates the balance in allele frequencies across populations. Population differentiation (FST) can provide evidence of environmental selection pressures. Such genetic information is limited in Puerto Ricans, the second largest Hispanic ethnic group in the US, and a group with high prevalence of chronic disease. We determined allele frequencies and population differentiation for 101 single nucleotide polymorphisms (SNPs) in 30 genes involved in major metabolic and disease-relevant pathways in Puerto Ricans (n = 969, ages 45–75 years) and compared them to similarly aged non-Hispanic whites (NHW) (n = 597). RESULTS. Minor allele frequency (MAF) distributions for 45.5% of the SNPs assessed in Puerto Ricans were significantly different from those of NHW. Puerto Ricans carried risk alleles in higher frequency and protective alleles in lower frequency than NHW. Patterns of population differentiation showed that Puerto Ricans had SNPs with exceptional FST values in intronic, non-synonymous and promoter regions. NHW had exceptional FST values in intronic and promoter region SNPs only. CONCLUSION. These observations may serve to explain and broaden studies on the impact of gene polymorphisms on chronic diseases affecting Puerto Ricans. Funding: National Institutes of Health, National Institutes on Aging (P01AG02394, P01AG023394-SI); National Institutes of Health (53-K06-5-10); US Department of Agriculture Research Service (58-1950-9-001, 58-1950-7-707); National Institutes of Health & Heart, Lung, and Blood Institute (U 01 HL72524, Genetic and Environmental Determinants of Triglycerides, HL54776
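To make the FST quantity in this abstract concrete, here is a minimal sketch of a simple Wright-style FST for one biallelic SNP in two populations, computed from allele frequencies alone. The abstract does not say which estimator the authors used, so the formula choice (equal weighting of the two populations, no sample-size correction), the function name, and the example frequencies are assumptions.

```python
def fst_two_populations(p1, p2):
    """Wright-style FST = (HT - HS) / HT for one biallelic SNP.

    p1, p2: frequency of the same allele in population 1 and population 2.
    The populations are weighted equally (an assumption of this sketch).
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        # The allele is fixed (or absent) in both populations: no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies (illustrative only).
print(fst_two_populations(0.2, 0.4))
```

Identical frequencies give an FST of 0, and alleles fixed for opposite variants in the two populations give an FST of 1, so larger values indicate stronger differentiation at that SNP.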

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

PubMed Central

Genetic analyses of longitudinal phenotype data: a comparison of univariate methods and a multivariate approach

Author: Atwood Larry D
Chazaro Irmarie
Cui Jing
Cupples L Adrienne
Demissie Serkalem
DeStefano Anita L
Guo Chao-Yu
Larson Martin
Yang Qiong
Publication venue: BioMed Central
Publication date: 01/01/2003

BACKGROUND: We explored three approaches to heritability and linkage analyses of longitudinal total cholesterol levels (CHOL) in the Genetic Analysis Workshop 13 simulated data without knowing the answers. The first two were univariate approaches and used 1) baseline measure at exam one or 2) summary measures such as mean and slope from multiple exams. The third method was a multivariate approach that directly models multiple measurements on a subject. A variance components model (SOLAR) was employed in the univariate approaches. A mixed regression model with polynomials was employed in the multivariate approach and implemented in SAS/IML. RESULTS: Using the baseline measure at exam 1, we detected all baseline or slope genes contributing a substantial amount (0.08) of variance (LOD > 3). Compared to the baseline measure, the mean measures yielded slightly higher LOD at the slope genes, and a lower LOD at the baseline genes. The slope measure produced a somewhat lower LOD for the slope gene than did the mean measure. Descriptive information on the pattern of changes in gene effects with age was estimated for three linked loci by the third approach. CONCLUSION: We found simple univariate methods may be effective to detect genes affecting longitudinal phenotypes but may not fully reveal temporal trends in gene effects. The relative efficiency of the univariate methods to detect genes depends heavily on the underlying model. Compared with the univariate approaches, the multivariate approach provided more information on temporal trends in gene effects at the cost of more complicated modelling and more intense computations

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

PubMed Central

Large meta-analysis of genome-wide association studies identifies five loci for lean body mass

Author: Chou Wen-Chi
Demissie Serkalem
Demuth Ilja
Hsu Yi-Hsiang
Steinhagen-Thiessen Elisabeth [and many more]
Stolk Lisette
Yerges-Armstrong Laura M.
Zillikens Carola M.
Publication venue
Publication date: 01/01/2017

Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p < 5 × 10⁻⁸) or suggestively genome wide (p < 2.3 × 10⁻⁶). Replication in 63,475 (47,227 of European ancestry) individuals from 33 cohorts for whole body lean body mass and in 45,090 (42,360 of European ancestry) subjects from 25 cohorts for appendicular lean body mass was successful for five single-nucleotide polymorphisms in/near HSD17B11, VCAN, ADAMTSL3, IRS1, and FTO for total lean body mass and for three single-nucleotide polymorphisms in/near VCAN, ADAMTSL3, and IRS1 for appendicular lean body mass. Our findings provide new insight into the genetics of lean body mass

Institutional Repository of the Freie Universität Berlin

Dietary Intake of n-6 Fatty Acids Modulates Effect of Apolipoprotein A5 Gene on Plasma Fasting Triglycerides, Remnant Lipoprotein Concentrations, and Lipoprotein Particle Size

Author: Adiconis Xian
Corella Dolores
Cupples L. Adrienne
Demissie Serkalem
Lai Chao-Qiang
Ordovas Jose M.
Parnell Laurence D.
Tucker Katherine L.
Yueping Zhu
Publication venue: 'Ovid Technologies (Wolters Kluwer Health)'
Publication date: 01/01/2006

Background: Apolipoprotein A5 gene (APOA5) variation is associated with plasma triglycerides (TGs). However, little is known about whether dietary fat modulates this association. Methods and Results: We investigated the interaction between APOA5 gene variation and dietary fat in determining plasma fasting TGs, remnant-like particle (RLP) concentrations, and lipoprotein particle size in 1001 men and 1147 women who were Framingham Heart Study participants. Polymorphisms –1131T>C and 56C>G, representing 2 independent haplotypes, were analyzed. Significant gene–diet interactions between the –1131T>C polymorphism and polyunsaturated fatty acid (PUFA) intake were found (PG polymorphism. The –1131C allele was associated with higher fasting TGs and RLP concentrations (P6% of total energy). No heterogeneity by sex was found. These interactions showed a dose-response effect when PUFA intake was considered as a continuous variable (P<0.01). Similar interactions were found for the sizes of VLDL and LDL particles. Only in carriers of the –1131C allele did the size of these particles increase (VLDL) or decrease (LDL) as PUFA intake increased (P<0.01). We further analyzed the effects of n-6 and n-3 fatty acids and found that the PUFA–APOA5 interactions were specific for dietary n-6 fatty acids. Conclusions: Higher n-6 (but not n-3) PUFA intake increased fasting TGs, RLP concentrations, and VLDL size and decreased LDL size in APOA5 –1131C carriers, suggesting that n-6 PUFA–rich diets are related to a more atherogenic lipid profile in these subjects.

Crossref

Repositori d'Objectes Digitals per a l'Ensenyament la Recerca i la Cultura

Secretaría de Estado de Cultura

Erratum: Large meta-analysis of genome-wide association studies identifies five loci for lean body mass

Author: Chou Wen-Chi
Demissie Serkalem
Hsu Yi-Hsiang
Wichmann Heinz-Erich
Yerges-Armstrong Laura M.
Zillikens M. Carola
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Open Access LMU

Data abstractions for decision tree induction

Author: Cupples L Adrienne
D'Agostino Ralph B
Demissie Serkalem
Fox Caroline S
Hoffmann Udo
Hwang Shih-Jen
Ingellson Erik
Liu Chunyu
Murabito Joanne M
O'Donnell Christopher J
Polak Joseph F
Wolf Philip A
Publication venue: Elsevier Science B.V.
Publication date: 01/01/2003

Abstract: When descriptions of data values in a database are too concrete or too detailed, the computational complexity needed to discover useful knowledge from the database generally increases. Furthermore, discovered knowledge tends to become complicated. A notion of data abstraction seems useful for resolving this kind of problem, as we obtain a smaller and more general database after the abstraction, from which we can quickly extract more abstract knowledge that is expected to be easier to understand. In general, however, since several abstractions are possible, we have to carefully select the one according to which the original database is generalized. An inadequate selection would reduce the accuracy of the extracted knowledge. From this point of view, we propose in this paper a method of selecting an appropriate abstraction from the possible ones, assuming that our task is to construct a decision tree from a relational database. Suppose that, for each attribute in a relational database, we have a class of possible abstractions for the attribute values. As an appropriate abstraction for each attribute, we prefer an abstraction such that, even after the abstraction, the distribution of target classes necessary to perform our classification task can be preserved within an acceptable error range given by the user. With the selected abstractions, the original database can be transformed into a small generalized database written in abstract values. Therefore, it would be expected that, from the generalized database, we can construct a decision tree whose size is much smaller than one constructed from the original database. Furthermore, such a size reduction can be justified under some theoretical assumptions. The appropriateness of abstraction is precisely defined in terms of standard information theory. Therefore, we call our abstraction framework Information Theoretical Abstraction. We show some experimental results obtained by a system, ITA, that is an implementation of our abstraction method. From those results, it is verified that our method is very effective in reducing the size of the detected decision tree without substantially worsening classification errors

Elsevier - Publisher Connector

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

PubMed Central

Genome-wide association study for subclinical atherosclerosis in major arterial territories in the NHLBI's Framingham Heart Study

Author: Cupples L Adrienne
D'Agostino Ralph B
Demissie Serkalem
Fox Caroline
Hoffmann Udo
Hwang Shih-Jen
Ingellson Erik
Liu Chunyu
Murabito Joanne M
O'Donnell Christopher Joseph
Polak Joseph F
Wolf Philip A
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/12/2011

Introduction: Subclinical atherosclerosis (SCA) measures in multiple arterial beds are heritable phenotypes that are associated with increased incidence of cardiovascular disease. We conducted a genome-wide association study (GWAS) for SCA measurements in the community-based Framingham Heart Study. Methods: Over 100,000 single nucleotide polymorphisms (SNPs) were genotyped (Human 100K GeneChip, Affymetrix) in 1345 subjects from 310 families. We calculated sex-specific age-adjusted and multivariable-adjusted residuals in subjects tested for quantitative SCA phenotypes, including ankle-brachial index, coronary artery calcification and abdominal aortic calcification using multi-detector computed tomography, and carotid intimal medial thickness (IMT) using carotid ultrasonography. We evaluated associations of these phenotypes with 70,987 autosomal SNPs with minor allele frequency ≥ 0.10, call rate ≥ 80%, and Hardy-Weinberg p-value ≥ 0.001 in samples ranging from 673 to 984 subjects, using linear regression with generalized estimating equations (GEE) methodology and family-based association testing (FBAT). Variance components LOD scores were also calculated. Results: There was no association result meeting criteria for genome-wide significance, but our methods identified 11 SNPs with p < 10⁻⁵ by GEE and five SNPs with p < 10⁻⁵ by FBAT for multivariable-adjusted phenotypes. Among the associated variants were SNPs in or near genes that may be considered candidates for further study, such as rs1376877 (GEE p < 0.000001, located in ABI2) for maximum internal carotid artery IMT and rs4814615 (FBAT p = 0.000003, located in PCSK2) for maximum common carotid artery IMT. Modest significant associations were noted with various SCA phenotypes for variants in previously reported atherosclerosis candidate genes, including NOS3 and ESR1. Associations were also noted of a region on chromosome 9p21 with CAC phenotypes that confirm associations with coronary heart disease and CAC in two recently reported genome-wide association studies. In linkage analyses, several regions of genome-wide linkage were noted, confirming previously reported linkage of internal carotid artery IMT on chromosome 12. All GEE, FBAT and linkage results are provided as an open-access results resource at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin\study.cgi?id=phs000007. Conclusion: The results from this GWAS generate hypotheses regarding several SNPs that may be associated with SCA phenotypes in multiple arterial beds. Given the number of tests conducted, subsequent independent replication in a staged approach is essential to identify genetic variants that may be implicated in atherosclerosis
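The SNP inclusion criteria stated in this abstract (minor allele frequency of at least 0.10, call rate of at least 80%, Hardy-Weinberg p-value of at least 0.001) can be sketched as a small filter over genotype counts. This is an illustrative sketch only: the `GenotypeCounts` type, the function names, and the use of a one-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium are assumptions, since the abstract does not specify how the Hardy-Weinberg p-value was computed.

```python
import math
from dataclasses import dataclass


@dataclass
class GenotypeCounts:
    """Hypothetical per-SNP genotype counts for alleles A and B."""
    n_aa: int
    n_ab: int
    n_bb: int
    n_missing: int = 0

    @property
    def n_called(self):
        return self.n_aa + self.n_ab + self.n_bb


def allele_a_frequency(g):
    if g.n_called == 0:
        raise ValueError("no called genotypes for this SNP")
    return (2 * g.n_aa + g.n_ab) / (2 * g.n_called)


def minor_allele_frequency(g):
    p = allele_a_frequency(g)
    return min(p, 1.0 - p)


def call_rate(g):
    return g.n_called / (g.n_called + g.n_missing)


def hwe_pvalue(g):
    """Chi-square (1 df) test of Hardy-Weinberg proportions (assumed test choice)."""
    n = g.n_called
    p = allele_a_frequency(g)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (g.n_aa, g.n_ab, g.n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: the test is not informative
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(g, min_maf=0.10, min_call_rate=0.80, min_hwe_p=0.001):
    """Apply the three thresholds quoted in the abstract."""
    return (minor_allele_frequency(g) >= min_maf
            and call_rate(g) >= min_call_rate
            and hwe_pvalue(g) >= min_hwe_p)


# Hypothetical SNP with genotype counts close to Hardy-Weinberg proportions.
print(passes_qc(GenotypeCounts(n_aa=360, n_ab=480, n_bb=160)))
```

Counting missing genotypes separately keeps the call rate independent of the allele-frequency calculation, which uses called genotypes only.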

Harvard University - DASH

A genome-wide association study for blood lipid phenotypes in the Framingham Heart Study

Author: Arnett Donna K
Burtt Nöel P
Cupples L Adrienne
D'Agostino Ralph B
Demissie Serkalem
Gianniny Lauren
Guiducci Candace
Kathiresan Sekar
Manning Alisa K
Melander Olle
Ordovas Jose M
Orho-Melander Marju
Peloso Gina M
Surti Aarti
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Blood lipid levels including low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG) are highly heritable. Genome-wide association is a promising approach to map genetic loci related to these heritable phenotypes. Methods: In 1087 Framingham Heart Study Offspring cohort participants (mean age 47 years, 52% women), we conducted genome-wide analyses (Affymetrix 100K GeneChip) for fasting blood lipid traits. Total cholesterol, HDL-C, and TG were measured by standard enzymatic methods and LDL-C was calculated using the Friedewald formula. The long-term averages of up to seven measurements of LDL-C, HDL-C, and TG over a ~30 year span were the primary phenotypes. We used generalized estimating equations (GEE), family-based association tests (FBAT) and variance components linkage to investigate the relationships between SNPs (on autosomes, with minor allele frequency ≥10%, genotypic call rate ≥80%, and Hardy-Weinberg equilibrium p ≥ 0.001) and multivariable-adjusted residuals. We pursued a three-stage replication strategy of the GEE association results with 287 SNPs (P < 0.001 in Stage I) tested in Stage II (n ~1450 individuals) and 40 SNPs (P < 0.001 in joint analysis of Stages I and II) tested in Stage III (n ~6650 individuals). Results: Long-term averages of LDL-C, HDL-C, and TG were highly heritable (h² = 0.66, 0.69, 0.58, respectively; each P < 0.0001). Of 70,987 tests for each of the phenotypes, two SNPs had p < 10⁻⁵ in GEE results for LDL-C, four for HDL-C, and one for TG. For each multivariable-adjusted phenotype, the number of SNPs with association p < 10⁻⁴ ranged from 13 to 18 and with p < 10⁻³, from 94 to 149. Some results confirmed previously reported associations with candidate genes including variation in the lipoprotein lipase gene (LPL) and HDL-C and TG (rs7007797; P = 0.0005 for HDL-C and 0.002 for TG). The full set of GEE, FBAT and linkage results are posted at the database of Genotype and Phenotype (dbGaP). After three stages of replication, there was no convincing statistical evidence for association (i.e., combined P < 10⁻⁵ across all three stages) between any of the tested SNPs and lipid phenotypes. Conclusion: Using a 100K genome-wide scan, we have generated a set of putative associations for common sequence variants and lipid phenotypes. Validation of selected hypotheses in additional samples did not identify any new loci underlying variability in blood lipids. Lack of replication may be due to inadequate statistical power to detect modest quantitative trait locus effects (i.e., <1% of trait variance explained) or reduced genomic coverage of the 100K array. GWAS in FHS using a denser genome-wide genotyping platform and a better-powered replication strategy may identify novel loci underlying blood lipids.
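The abstract notes that LDL-C was calculated with the Friedewald formula. A minimal sketch of that calculation follows, using the standard form LDL-C = total cholesterol − HDL-C − TG/5 with all values in mg/dL. The function name, the example values, and the choice to raise an error when triglycerides reach 400 mg/dL (the level above which the formula is conventionally considered unreliable) are assumptions of this sketch, not details taken from the study.

```python
def friedewald_ldl(total_cholesterol, hdl_c, triglycerides):
    """Estimate LDL-C in mg/dL with the Friedewald formula.

    LDL-C = total cholesterol - HDL-C - triglycerides / 5
    All inputs are fasting values in mg/dL.
    """
    if triglycerides >= 400:
        # Conventional validity limit; how the study handled such values is not stated.
        raise ValueError("Friedewald formula is not considered valid for TG >= 400 mg/dL")
    return total_cholesterol - hdl_c - triglycerides / 5.0


# Hypothetical fasting lipid panel in mg/dL (illustrative only).
print(friedewald_ldl(total_cholesterol=200, hdl_c=50, triglycerides=150))
```

The TG/5 term approximates very-low-density lipoprotein cholesterol, which is why the estimate depends on fasting triglyceride measurements.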

Lund University Publications

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central