King's Research Portal

AWclust: point-and-click software for non-parametric population structure analysis

Author: A Bowcock
AL Price
B Devlin
B Wu
CJ Hoggart
D Falush
ES Lander
G Guillot
H Tang
J Corander
J Marchini
J Mountain
JK Pritchard
Joshua D Starmer
KJ Dawson
L Excoffer
LL Cavalli-Sforza
M Bauchet
M Freedman
M Shriver
N Liu
N Patterson
N Rosenberg
NJ Risch
O Lao
PM McKeigue
R Kaeuffer
R Tibshirani
S Purcell
SL Guthery
X Gao
Xiaoyi Gao
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: Population structure analysis is important to genetic association studies and evolutionary investigations. Parametric approaches, e.g. STRUCTURE and L-POP, usually assume Hardy-Weinberg equilibrium (HWE) and linkage equilibrium among loci in the sampled individuals. However, these assumptions may not hold, and allele frequency estimates may be inaccurate in some data sets. The improved version of STRUCTURE (version 2.1) can incorporate linkage information among loci but is still sensitive to high background linkage disequilibrium. Large-scale single nucleotide polymorphism (SNP) data are now becoming common in genetic studies. It is therefore imperative to have software that makes full use of these genetic data to generate inferences even when model assumptions do not hold or allele frequency estimation suffers from high variation. Results: We have developed point-and-click software for non-parametric population structure analysis, distributed as an R package. The software takes advantage of the large number of SNPs available to categorize individuals into ethnically similar clusters. It requires no assumptions about population models and does not estimate allele frequencies. Moreover, it can infer the optimal number of populations. Conclusion: Our software tool employs non-parametric approaches to assign individuals to clusters using SNPs. It provides efficient computation and an intuitive way for researchers to explore ethnic relationships among individuals. It can be complementary to parametric approaches in population structure analysis.
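The abstract above does not name the distance measure behind the non-parametric clustering. As a minimal sketch, assuming an allele-sharing distance computed directly from genotypes (one common choice that needs no allele frequency estimates), a pairwise distance matrix could be built as follows. The function names and the 0/1/2 genotype coding are illustrative assumptions, and the code is in Python rather than the R of the package itself.

```python
from itertools import combinations


def allele_sharing_distance(g1, g2):
    """Mean per-SNP allele-sharing distance between two individuals.

    Genotypes are coded as 0, 1 or 2 copies of a reference allele
    (an assumed coding); None marks a missing call. Per SNP, the
    distance |a - b| is 0 when both alleles are shared, 1 when one
    is shared and 2 when none is shared.
    """
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no SNP is genotyped in both individuals")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def distance_matrix(genotypes):
    """Symmetric matrix of pairwise allele-sharing distances."""
    n = len(genotypes)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = allele_sharing_distance(genotypes[i], genotypes[j])
        matrix[i][j] = matrix[j][i] = d
    return matrix


# Three hypothetical individuals typed at four SNPs.
samples = [
    [0, 0, 2, 1],
    [0, 1, 2, 1],
    [2, 2, 0, None],
]
matrix = distance_matrix(samples)
```

A matrix of this kind can then be passed to any hierarchical clustering routine; the first two individuals above are close to each other and far from the third.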

Springer - Publisher Connector

Carolina Digital Repository

PGA: power calculator for case-control genetic association analyses

Author: AD Skol
BE Chen
Bingshu E Chen
C Lange
CA Haiman
D Gordon
DE Weeks
DR Nyholt
ES Lander
FM De La Vega
Idan Menashe
J Gudmundsson
J Knight
JH Lubin
JS Witte
LM Ploughman
LR Cardon
M Yeager
N Risch
Philip S Rosenberg
S Purcell
TA Manolio
TM Frayling
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/05/2008

Abstract. Background: Statistical power calculations inform the design and interpretation of genetic association studies, but few programs are tailored to case-control studies of single nucleotide polymorphisms (SNPs) in unrelated subjects. Results: We have developed the "Power for Genetic Association analyses" (PGA) package, which comprises algorithms and graphical user interfaces for sample size and minimum detectable risk calculations using SNP or haplotype effects under different genetic models and study constraints. The software accounts for linkage disequilibrium and statistical multiple comparisons. The results are presented in graphs or tables and can be printed or exported in standard file formats. Conclusion: PGA is user-friendly software that can facilitate decision making for association studies of candidate genes, fine-mapping studies, and whole-genome scans. Stand-alone executable files and a Matlab toolbox are available for download at: http://dceg.cancer.gov/bb/tools/pga
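To make the idea of a case-control power calculation concrete, the following is a minimal sketch and not PGA's algorithm. It assumes a two-sided allelic test with a normal approximation for two proportions, a per-allele odds ratio, and a Bonferroni adjustment for multiple comparisons. It ignores linkage disequilibrium and the alternative genetic models that PGA is described as handling. All function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(n_cases, n_controls, control_freq, odds_ratio,
                       alpha=0.05, n_tests=1):
    """Approximate power of a two-sided test comparing allele frequencies.

    Each subject contributes two alleles. The significance level is
    divided by n_tests (Bonferroni) before the critical value is taken.
    """
    std_normal = NormalDist()
    p_controls = control_freq
    p_cases = case_allele_freq(control_freq, odds_ratio)
    alleles_cases = 2 * n_cases
    alleles_controls = 2 * n_controls
    std_err = sqrt(p_cases * (1.0 - p_cases) / alleles_cases
                   + p_controls * (1.0 - p_controls) / alleles_controls)
    z_crit = std_normal.inv_cdf(1.0 - (alpha / n_tests) / 2.0)
    z_effect = abs(p_cases - p_controls) / std_err
    # Probability that the test statistic falls in either rejection tail.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Hypothetical design: risk-allele frequency 0.3 in controls, odds ratio 1.3.
power_small = allelic_test_power(500, 500, 0.3, 1.3)
power_large = allelic_test_power(2000, 2000, 0.3, 1.3)
power_corrected = allelic_test_power(2000, 2000, 0.3, 1.3, n_tests=100)
```

Under these assumptions, power rises with sample size and falls as the number of tests corrected for increases, which is the trade-off such a calculator lets a study designer explore.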

Haplotype Analysis Reveals a Possible Founder Effect of RET Mutation R114H for Hirschsprung's Disease in the Chinese Population

Author: A Chakravarti
A Schuchardt
Amanda Ewart Toland
Belinda K. Cornes
Clara S. Tang
EA Grice
ES Emison
F Lantieri
J Amiel
J Marchini
JP Reeve
Kenneth J. W. S. Hui
M Garcia-Barcelo
M Stephens
Man-Ting So
Maria-Merce Garcia-Barcelo
MM Garcia-Barcelo
N Li
Pak C. Sham
Paul K. H. Tam
S Lyonnet
S Purcell
Stacey S. Cherny
Thomas Y. Y. Leon
WP Maksymowych
Xiaoping Miao
Publication venue: Public Library of Science
Publication date: 01/01/2010

Background: Hirschsprung's disease (HSCR) is a congenital disorder associated with the lack of intramural ganglion cells in the myenteric and sub-mucosal plexuses along varying segments of the gastrointestinal tract. The RET gene is the major gene implicated in this gastrointestinal disease. A highly recurrent mutation in RET (RET R114H) has recently been identified in ~6-7% of Chinese HSCR patients and, to date, has not been found in Caucasian patients or controls, nor in Chinese controls. Given the high frequency of RET R114H in this population, we sought to investigate whether it may be a founder HSCR mutation in the Chinese population. Methodology and Principal Findings: To test whether all RET R114H mutations originated from a single mutational event, we applied a Bayesian method to RET SNPs genotyped in 430 Chinese HSCR patients (of whom 25 carried the mutation) and estimated the age of RET R114H at 4-23 generations, depending on growth rate. We reasoned that if RET R114H is a founder mutation, then those with the mutation would share the haplotype on which it resides. Using SNPs spanning 509.31 kb across RET from a recently obtained 500 K genome-wide dataset for a subset of 181 patients (14 with RET R114H), we applied haplotype estimation methods to determine whether any segments were shared between patients with RET R114H but absent from patients without the mutation and from controls. The analysis yielded a 250.2 kb (51 SNP) shared segment over the RET gene (and downstream) only in patients with the mutation, with no similar segments found among other patients. Conclusions: This suggests that RET R114H is a founder mutation for HSCR in the Chinese population. © 2010 Cornes et al.

Universidade do Minho: RepositoriUM

HKU Scholars Hub

Incidence and diversity of the fungal genera Aspergillus and Penicillium in Portuguese almonds and chestnuts

Author: AD King
AN Kaaya
Armando Venâncio
AZ Joffe
BL Teviotdale
DJ Phillips
ES Hoekstra
FCO Freire
J Maroco
JB Silva da
JC Barreira
JC Zak
KM Abdel-Gawad
L Rosso
LO Adebajo
M Jimenez
MA Doster
MA Klich
N Magan
Nelson Lima
O Filtenborg
P Bayman
P Rodrigues
Paula Rodrigues
PK Singh
PW Wareing
RA Samson
SL Purcell
TN Sieber
VK Nakai
W Gams
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Almonds (Prunus dulcis (Miller) D.A. Webb) and European (sweet) chestnuts (Castanea sativa Miller) are of great economic and social importance in Mediterranean countries, and in some areas they constitute the main income of rural populations. Despite all efforts to control fungal contamination, toxigenic fungi are ubiquitous in nature and occur regularly in food supplies worldwide, and these nuts are no exception. This work aimed to provide knowledge on the general mycobiota of Portuguese almonds and chestnuts, and its evolution from the field to the end of storage. To this end, 45 field chestnut samples and 36 almond samples (30 field samples and six storage samples) were collected in Trás-os-Montes, Portugal. All fungi belonging to the genus Aspergillus were isolated and identified to the section level. Fungi representative of other genera were identified to the genus level. In the field, chestnuts were mainly contaminated with the genera Fusarium, Cladosporium, Alternaria and Penicillium, and the genus Aspergillus was only rarely found, whereas almonds were more contaminated with Aspergillus. In almonds, Aspergillus incidence increased significantly from the field to the end of storage, but diversity decreased, with potentially toxigenic isolates belonging to sections Flavi and Nigri becoming more significant and widespread throughout storage. These fungi were determined to be moderately associated, which can indicate mycotoxin co-contamination problems if adequate storage conditions are not secured. P. Rodrigues was supported by grants SFRH/BD/28332/2006 from Fundacao para a Ciencia e a Tecnologia (FCT), and SFRH/PROTEC/49555/2009 from FCT and Polytechnic Institute of Braganca, Portugal.

Biblioteca Digital do IPB

Using an Uncertainty-Coding Matrix in Bayesian Regression Models for Haplotype-Specific Risk Detection in Family Association Studies

Author: A Krishna
AP Morris
C Herold
CA Ross
Chuhsing Kate Hsiao
CM Liu
D Schaid
D Zaykin
DY Lin
E Walker
EG Holliday
ES Soofi
FK Mensah
H Seltman
HC Tsuang
HG Hwu
HJ Cordell
JY Tzeng
M Tsuang
Mei-Hsien Lee
MH Lee
P Sham
RP Igo
S Horvath
S Purcell
S Saha
SH Lin
T Becker
Thomas Mailund
W Guo
Wei J. Chen
Yung-Hsiang Huang
Publication venue: Public Library of Science
Publication date: 15/07/2011

Haplotype association studies based on family genotype data can provide more biological information than single marker association studies. Difficulties arise, however, in inferring haplotype phase and haplotype transmission/non-transmission status. Incorporating the uncertainty associated with haplotype inference into regression models requires special care. The task becomes even more complicated when the genetic region contains a large number of haplotypes. To avoid the curse of dimensionality, we employ a clustering algorithm based on the evolutionary relationship among haplotypes and retain for regression analysis only the ancestral core haplotypes it identifies. To integrate the three sources of variation (phase ambiguity, transmission status and ancestral uncertainty), we propose an uncertainty-coding matrix that combines these three types of variability simultaneously. We then evaluate haplotype risk by using this matrix in a Bayesian conditional logistic regression model. Simulation studies and one application, a schizophrenia multiplex family study, are presented, and the results are compared with those from other family-based analysis tools such as FBAT. Our proposed method (Bayesian regression using uncertainty-coding matrix, BRUCM) is shown to perform better, and the implementation in R is freely available.

National Taiwan University Repository

Genome-Wide Effects of Long-Term Divergent Selection

Author: A-L Raquin
AJ Berry
Anna M. Johansson
Bruce Walsh
CC Laurie
CJ Rubin
DS Falconer
EA Dunnington
ES Buckler
GL Marquez
H Teotónio
H-B Park
HA Orr
J Hermisson
J Maynard Smith
JH Gillespie
L Jacobsson
M Kimura
M Przeworski
Mats E. Pettersson
MN Weedon
P Wahlberg
Paul B. Siegel
PC Sabeti
PS Pennings
S Purcell
S Wright
SP Otto
WE Castle
WG Hill
ZB Zeng
Ö Carlborg
Örjan Carlborg
Publication venue: Public Library of Science
Publication date: 01/11/2010

To understand the genetic mechanisms leading to phenotypic differentiation, it is important to identify genomic regions under selection. We scanned the genomes of two chicken lines from a single-trait selection experiment, in which 50 generations of selection have resulted in a 9-fold difference in body weight. Analyses of nearly 60,000 SNP markers showed that the effects of selection on the genome are dramatic. The lines were fixed for alternative alleles in more than 50 regions as a result of selection. Another 10 regions displayed strong evidence for ongoing differentiation during the last 10 generations. Many more regions across the genome showed large differences in allele frequency between the lines, indicating that the phenotypic evolution of the lines over 50 generations results from the exploitation of standing genetic variation at hundreds of loci across the genome.
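The comparison described above rests on per-line allele frequencies at each SNP. A minimal sketch of that step, assuming 0/1/2 genotype coding and a hypothetical cut-off of 0.8 for a "large" frequency difference (the abstract states no threshold), might look like this; the function names and labels are illustrative only.

```python
def allele_frequency(genotypes):
    """Frequency of the counted allele among called genotypes.

    Genotypes are coded as 0, 1 or 2 copies of one allele (an assumed
    coding); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes at this SNP")
    return sum(called) / (2 * len(called))


def classify_snp(line_a, line_b, large_diff=0.8):
    """Label one SNP by the allele frequency difference between two lines."""
    diff = abs(allele_frequency(line_a) - allele_frequency(line_b))
    if diff == 1.0:
        # One line carries only one allele, the other only the alternative.
        return "fixed for alternative alleles"
    if diff >= large_diff:
        return "large frequency difference"
    return "similar"


# Hypothetical genotypes at three SNPs in a high-weight and a low-weight line.
label_fixed = classify_snp([2, 2, 2], [0, 0, 0])
label_large = classify_snp([2, 2, 2, 2, 2], [0, 0, 0, 0, 1])
label_similar = classify_snp([1, 1, 2, 0], [1, 0, 2, 1])
```

A genome scan would apply a comparison of this kind to every marker and then look for runs of adjacent SNPs with the same label to define regions.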

Fine Mapping of the NRG1 Hirschsprung's Disease Locus

The primary pathology of Hirschsprung's disease (HSCR, colon aganglionosis) is the absence of ganglia in variable lengths of the hindgut, resulting in functional obstruction. HSCR is attributed to a failure of migration of the enteric ganglion precursors along the developing gut. RET is a key regulator of the development of the enteric nervous system (ENS) and the major HSCR-causing gene. Yet the reduced penetrance of RET DNA HSCR-associated variants, together with the phenotypic variability, suggests the involvement of additional genes in the disease. Through a genome-wide association study, we uncovered a ∼350 kb HSCR-associated region encompassing part of the neuregulin-1 gene (NRG1). To identify the causal NRG1 variants contributing to HSCR, we genotyped 243 SNPs in 343 ethnic Chinese HSCR patients and 359 controls. Genotype analysis coupled with imputation narrowed the HSCR-associated region down to 21 kb, with four of the most associated SNPs (rs10088313, rs10094655, rs4624987, and rs3884552) mapping to the NRG1 promoter. We investigated whether there was a correlation between the genotype at the rs10088313 locus and the amount of NRG1 expressed in human gut tissues (40 patients and 21 controls) and found differences in expression as a function of genotype. We also found significant differences in NRG1 expression levels between diseased and control individuals bearing the same rs10088313 risk genotype. This indicates that the effects of NRG1 common variants are likely to depend on other alleles or epigenetic factors present in the patients, which would account for the variability in the genetic predisposition to HSCR.

HKU Scholars Hub

No association between polymorphisms of WNT2 and schizophrenia in a Korean population

Abstract. Background: Wingless-type MMTV integration site family member 2 (WNT2) has a potentially important role in neuronal development; however, there has yet to be an investigation into the association between single nucleotide polymorphisms (SNPs) of WNT2 and schizophrenia. This study aimed to determine whether certain SNPs of WNT2 were associated with schizophrenia in a Korean population. Methods: We genotyped 7 selected SNPs in the WNT2 gene region (approximately 46 kb) using direct sequencing in 288 patients with schizophrenia and 305 healthy controls. Results: Of the SNPs examined, one SNP showed a weak association with schizophrenia (p = 0.017 in the recessive model). However, this association did not remain statistically significant after Bonferroni correction. Conclusion: The present study does not support a major role for WNT2 in schizophrenia. This could be due to the size of the population. Therefore, additional studies would be needed to definitively rule out minor effects of the gene.
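The Bonferroni step reported above can be illustrated with a short worked example. The abstract does not state how many tests were corrected for; the sketch below assumes one test per genotyped SNP (7 tests) and a family-wise significance level of 0.05, and the function names are hypothetical.

```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def is_significant(p_value, n_tests, alpha=0.05):
    """True if the p-value passes the Bonferroni-corrected threshold."""
    return p_value < alpha / n_tests


# Reported nominal p-value, with the assumed number of tests (7 SNPs).
nominal_p = 0.017
adjusted_p = bonferroni_adjust(nominal_p, 7)
passes_nominal = is_significant(nominal_p, 1)
passes_corrected = is_significant(nominal_p, 7)
```

Under this assumption the corrected threshold is 0.05 / 7, roughly 0.0071, so a nominal p-value of 0.017 passes the uncorrected threshold but not the corrected one, consistent with the result the abstract reports.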

Springer - Publisher Connector

Genomic Regions Identified by Overlapping Clusters of Nominally-Positive SNPs from Genome-Wide Studies of Alcohol and Illegal Substance Dependence

Author: AM Persico
C Johnson
Catherine Johnson
Donna Walther
ES Lander
G Joslyn
George R. Uhl
GR Uhl
GS Hageman
J Treutlein
JC Lambert
JI Nurnberger Jr
JL Haines
JP McElroy
JZ Liu
KS Kendler
LJ Bierut
LM Karkowski
MT Tsuang
QR Liu
S Jiao
S Purcell
SS Smith
T Drgon
TE Thorgeirsson
Thomas Mailund
Tomas Drgon
WD Dupont
WR True
Y Xie
Publication venue: Public Library of Science
Publication date: 27/07/2011

Declaring “replication” from results of genome wide association (GWA) studies is straightforward when major gene effects provide genome-wide significance for association of the same allele of the same SNP in each of multiple independent samples. However, such unambiguous replication is unlikely when phenotypes display polygenic genetic architecture, allelic heterogeneity, locus heterogeneity and when different samples display linkage disequilibria with different fine structures. We seek chromosomal regions that are tagged by clustered SNPs that display nominally-significant association in each of several independent samples. This approach provides one “nontemplate” approach to identifying overall replication of groups of GWA results in the face of difficult genetic architectures. We apply this strategy to 1 M SNP GWA results for dependence on: a) alcohol (including many individuals with dependence on other addictive substances) and b) at least one illegal substance (including many individuals dependent on alcohol). This approach provides high confidence in rejecting the null hypothesis that chance alone accounts for the extent to which clustered, nominally-significant SNPs from samples of the same racial/ethnic background identify the same sets of chromosomal regions. It identifies several genes that are also reported in other independent alcohol-dependence GWA datasets. There is more modest confidence in: a) identification of individual chromosomal regions and genes that are not also identified by data from other independent samples, b) the more modest overlap between results from samples of different racial/ethnic backgrounds and c) the extent to which any gene not identified herein is excluded, since the power of each of these individual samples is modest. 
Nevertheless, the strong overlap identified among the samples with similar racial/ethnic backgrounds supports contributions to individual differences in vulnerability to addictions that come from newer allelic variants that are common in subsets of current humans.
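The region-based notion of replication described in this abstract can be sketched in code. The abstract gives no clustering criteria, so the maximum gap between neighbouring SNPs (10 kb) and the minimum number of SNPs per cluster (3) used below are hypothetical, as are the function names. The sketch handles positions on a single chromosome only.

```python
def cluster_regions(positions, max_gap=10_000, min_snps=3):
    """Group nominally significant SNP positions into (start, end) regions.

    Neighbouring SNPs no more than max_gap base pairs apart join the
    same run; runs with fewer than min_snps SNPs are discarded.
    """
    regions = []
    run = []
    for pos in sorted(positions):
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_snps:
                regions.append((run[0], run[-1]))
            run = []
        run.append(pos)
    if len(run) >= min_snps:
        regions.append((run[0], run[-1]))
    return regions


def overlapping_regions(regions_a, regions_b):
    """Intervals covered by a region from each of two samples."""
    overlaps = []
    for start_a, end_a in regions_a:
        for start_b, end_b in regions_b:
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start <= end:
                overlaps.append((start, end))
    return overlaps


# Hypothetical positions of nominally significant SNPs in two samples.
sample_a = [100, 5_000, 9_000, 500_000, 900_000, 905_000, 908_000]
sample_b = [4_000, 8_000, 12_000, 300_000]
regions_a = cluster_regions(sample_a)
regions_b = cluster_regions(sample_b)
shared = overlapping_regions(regions_a, regions_b)
```

Whether the amount of overlap exceeds what chance would produce needs a separate null comparison, for example by permutation, which this sketch does not attempt.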