113 research outputs found

A new skew-elliptical distribution and its properties

Author: Chai High S.
Sahu Sujit K.
Publication venue: Southampton Statistical Sciences Research Institute
Publication date: 15/11/2005

This article generalizes a multivariate skew-elliptical distribution and describes its many interesting properties. The univariate version of the new distribution is compared with two other currently used distributions. The use of the new distribution is illustrated with a real data example suitable for regression modelling. The new model provides a better fit than its two rivals, as evaluated by suitable Bayesian model selection criteria.

Southampton (e-Prints Soton)

Spatial normalization improves the quality of genotype calling for Affymetrix SNP 6.0 arrays

Author: Bailey Kent R
Chai High Seng
Kocher Jean-Pierre A
Therneau Terry M
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Microarray measurements are susceptible to a variety of experimental artifacts, some of which give rise to systematic biases that are spatially dependent in a unique way on each chip. It is likely that such artifacts affect many SNP arrays, but the normalization methods used in currently available genotyping algorithms make no attempt at spatial bias correction. Here, we propose an effective single-chip spatial bias removal procedure for Affymetrix 6.0 SNP arrays or platforms with similar design features. This procedure deals with both extreme and subtle biases and is intended to be applied before standard genotype calling algorithms. Results: Application of the spatial bias adjustments on HapMap samples resulted in higher genotype call rates with equal or even better accuracy for thousands of SNPs. Consequently the normalization procedure is expected to lead to more meaningful biological inferences and could be valuable for genome-wide SNP analysis. Conclusions: Spatial normalization can potentially rescue thousands of SNPs in a genetic study at the small cost of computational time. The approach is implemented in R and available from the authors upon request.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

GLOSSI: a method to assess the association of genetic loci-sets with complex diseases

Author: Asmann Yan W
Bailey Kent R
Chai High-Seng
Kocher Jean-Pierre A
Sicotte Hugues
Turner Stephen T
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The development of high-throughput genotyping technologies, which enable the simultaneous genotyping of hundreds of thousands of single nucleotide polymorphisms (SNPs), has the potential to increase the benefits of genetic epidemiology studies. Although the enhanced resolution of these platforms increases the chance of interrogating functional SNPs that are themselves causative or in linkage disequilibrium with causal SNPs, commonly used single SNP-association approaches suffer from serious multiple hypothesis testing problems and provide limited insights into combinations of loci that may contribute to complex diseases. Drawing inspiration from Gene Set Enrichment Analysis developed for gene expression data, we have developed a method, named GLOSSI (Gene-loci Set Analysis), that integrates prior biological knowledge into the statistical analysis of genotyping data to test the association of a group of SNPs (loci-set) with complex disease phenotypes. The most significant loci-sets can be used to formulate hypotheses from a functional viewpoint that can be validated experimentally. Results: In a simulation study, GLOSSI showed sufficient power to detect loci-sets with less than 10% of SNPs having moderate-to-large effect sizes and intermediate minor allele frequency values. When applied to a biological dataset where no single SNP-association was found in a previous study, GLOSSI was able to identify several loci-sets that are significantly related to blood pressure response to an antihypertensive drug. Conclusion: GLOSSI is valuable for testing the association of SNPs at multiple genetic loci with complex disease phenotypes. In contrast to methods based on the Kolmogorov-Smirnov statistic, the approach is parametric and only utilizes information from within the interrogated loci-set. It properly accounts for dependency among SNPs and allows the testing of loci-sets of any size.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Copy number variation and cytidine analogue cytotoxicity: A genome-wide association approach

Author: Chai High Seng
Hebbring Scott J
Kalari Krishna R
Kocher Jean-Pierre A
Li Liang
Wang Liewei
Weinshilboum Richard M
Publication venue: BioMed Central
Publication date: 01/01/2010

Springer - Publisher Connector

PubMed Central

TREAT: a bioinformatics tool for variant annotations and visualizations in targeted and exome sequencing data

Author: Ahmed A. Hadad
Ansorge
Asha Nair
Asif Hossain
Bonnefond
Eric W. Klee
Gilissen
Goya
Harbour
High-Seng Chai
Jean-Pierre A. Kocher
Krishna R. Kalari
Kumar
Langmead
Li
Lupski
McKenna
Metzker
Ng
Ng
Nix
Patrick H. Duffy
Robinson
Sana
Saurabh Baheti
Schuster
Shetty
Sumit Middha
Wang
Xiaoyu Liu
Yan W. Asmann
Ying Li
Yuji Zhang
Zhifu Sun
Publication venue: Oxford University Press

Summary: TREAT (Targeted RE-sequencing Annotation Tool) is a tool for facile navigation and mining of the variants from both targeted resequencing and whole exome sequencing. It provides a rich integration of publicly available as well as in-house developed annotations and visualizations for variants, variant-hosting genes and host-gene pathways.

Crossref

PubMed Central

A novel bioinformatics pipeline for identification and characterization of fusion transcripts in breast cancer and normal cell lines

Author: Asif Hossain
Barlund
Brian M. Necela
David W. Williamson
Derek Radisky
E. Aubrey Thompson
Edgren
Edith A. Perez
Futreal
Gary P. Schroth
Guffanti
High-Seng Chai
Huh
Jean-Pierre A. Kocher
Krishna R. Kalari
Langmead
Li
Li
Maher
Maher
Meltzer
Nagl
Pflueger
Sboner
Soda
Sumit Middha
Sun
Tomlins
Wang
Wang
Xiong
Yan W. Asmann
Zhao
Zhifu Sun
Publication venue: Oxford University Press

SnowShoes-FTD, developed for fusion transcript detection in paired-end mRNA-Seq data, employs multiple steps of false positive filtering to nominate fusion transcripts with near 100% confidence. Unique features include: (i) identification of multiple fusion isoforms from two gene partners; (ii) prediction of genomic rearrangements; (iii) identification of exon fusion boundaries; (iv) generation of a 5′–3′ fusion spanning sequence for PCR validation; and (v) prediction of the protein sequences, including frame shift and amino acid insertions. We applied SnowShoes-FTD to identify 50 fusion candidates in 22 breast cancer and 9 non-transformed cell lines. Five additional fusion candidates with two isoforms were confirmed. In all, 30 of 55 fusion candidates had in-frame protein products. No fusion transcripts were detected in non-transformed cells. Consideration of the possible functions of a subset of predicted fusion proteins suggests several potentially important functions in transformation, including a possible new mechanism for overexpression of ERBB2 in a HER-positive cell line. The source code of SnowShoes-FTD is provided in two formats: one configured to run on the Sun Grid Engine for parallelization, and the other formatted to run on a single LINUX node. Executables in PERL are available for download from our web site: http://mayoresearch.mayo.edu/mayo/research/biostat/stand-alone-packages.cfm

Crossref

PubMed Central

Batch effect correction for genome-wide methylation data with Illumina Infinium platform

Author: A Etcheverry
AE Teschendorff
AH Sims
BA Walker
BH Mecham
BM Bolstad
C Chen
CG Bell
Christopher J Klein
E Eisenberg
High Seng Chai
HM Byun
J Liu
J Luo
J Staaf
Jean-Pierre A Kocher
JT Bell
JT Leek
JT Leek
JY Park
K Kerkel
Krishna V Donkena
M Benito
M Bibikova
M Ko
N Vasiljevic
O Alter
P Du
PW Laird
PW Laird
R Chari
S Sun
Terry M Therneau
Vesna D Garovic
VK Rakyan
WE Johnson
Wendy M White
X Wang
Y Kobayashi
Yanhong Wu
Zhifu Sun
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Genome-wide methylation profiling has led to more comprehensive insights into gene regulation mechanisms and potential therapeutic targets. Illumina Human Methylation BeadChip is one of the most commonly used genome-wide methylation platforms. Similar to other microarray experiments, methylation data is susceptible to various technical artifacts, particularly batch effects. To date, little attention has been given to issues related to normalization and batch effect correction for this kind of data. Methods: We evaluated three common normalization approaches and investigated their performance in batch effect removal using three datasets with different degrees of batch effects generated from the HumanMethylation27 platform: quantile normalization at average β value (QNβ); two-step quantile normalization at probe signals implemented in the "lumi" package of R (lumi); and quantile normalization of A and B signals separately (ABnorm). Subsequent Empirical Bayes (EB) batch adjustment was also evaluated. Results: Each normalization could remove a portion of batch effects, and their effectiveness differed depending on the severity of batch effects in a dataset. For the dataset with minor batch effects (Dataset 1), normalization alone appeared adequate and "lumi" showed the best performance. However, all methods left substantial batch effects intact in the datasets with obvious batch effects, and further correction was necessary. Without any correction, 50 and 66 percent of CpGs were associated with batch effects in Datasets 2 and 3, respectively. After QNβ, lumi or ABnorm, the proportion of CpGs associated with batch effects was reduced to 24, 32, and 26 percent for Dataset 2; and 37, 46, and 35 percent for Dataset 3, respectively. Additional EB correction effectively removed such remaining non-biological effects. More importantly, the two-step procedure almost tripled the numbers of CpGs associated with the outcome of interest for the two datasets. Conclusion: Genome-wide methylation data from Infinium Methylation BeadChip can be susceptible to batch effects with profound impacts on downstream analyses and conclusions. Normalization can reduce some, but not all, batch effects. EB correction along with normalization is recommended for effective batch effect removal.
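
For readers unfamiliar with the normalization step named in this abstract, the following is a minimal sketch of generic quantile normalization across samples. It is not the paper's QNβ, lumi, or ABnorm implementation, and it does not include the Empirical Bayes adjustment; the function name `quantile_normalize`, the input layout (one list of values per sample), and the example β values are assumptions made purely for illustration.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length lists of values, one list per sample.

    Hypothetical helper for illustration only; ties are broken by position.
    """
    n_probes = len(samples[0])
    n_samples = len(samples)

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(sample) for sample in samples]
    reference = [
        sum(sample[k] for sample in sorted_samples) / n_samples
        for k in range(n_probes)
    ]

    normalized = []
    for sample in samples:
        # Probe indices ordered from smallest to largest value in this sample.
        order = sorted(range(n_probes), key=lambda i: sample[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            # Replace each value by the reference value at the same rank.
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Made-up beta values for three samples measured on three probes.
example = [[0.10, 0.50, 0.30], [0.30, 0.90, 0.60], [0.20, 0.60, 0.70]]
normalized_example = quantile_normalize(example)
```

After this step every sample shares the same set of values, so differences in overall distribution between samples are removed while each probe keeps its within-sample rank. Per the abstract, this kind of normalization alone left substantial batch effects in two of the three datasets.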

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Association of MAPT haplotypes with Alzheimer’s disease risk and MAPT brain gene expression levels

Author: Christopher P Kolbert
Curtis Younkin
Dennis W Dickson
Fanggeng Zou
Gerard D Schellenberg
Gina Bisceglio
High Chai
Jin Jen
John K Kauwe
Jonathan L Haines
Joseph E Parisi
Julia E Crook
Kimberly Malphrus
Li Ma
Lindsay A Farrer
Margaret A Pericak-Vance
Mariet Allen
Michaela Kachadoorian
Minerva M Carrasquillo
Neill R Graff-Radford
Nilüfer Ertekin-Taner
Paul K Crane
Richard Mayeux
Ronald C Petersen
Sarah Lincoln
Shubhabrata Mukherjee
Siddharth Krishnan
Steven G Younkin
Thuy Nguyen
V Pankratz
Zachary Quicksall
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2014

Introduction: MAPT encodes tau, the predominant component of neurofibrillary tangles that are neuropathological hallmarks of Alzheimer’s disease (AD). Genetic association of MAPT variants with late-onset AD (LOAD) risk has been inconsistent, although insufficient power and incomplete assessment of MAPT haplotypes may account for this. Methods: We examined the association of MAPT haplotypes with LOAD risk in more than 20,000 subjects (n-cases = 9,814, n-controls = 11,550) from Mayo Clinic (n-cases = 2,052, n-controls = 3,406) and the Alzheimer’s Disease Genetics Consortium (ADGC, n-cases = 7,762, n-controls = 8,144). We also assessed associations with brain MAPT gene expression levels measured in the cerebellum (n = 197) and temporal cortex (n = 202) of LOAD subjects. Six single nucleotide polymorphisms (SNPs) which tag MAPT haplotypes with frequencies greater than 1% were evaluated. Results: The H2 haplotype-tagging rs8070723-G allele was associated with reduced risk of LOAD (odds ratio, OR = 0.90, 95% confidence interval, CI = 0.85-0.95, p = 5.2E-05) with consistent results in the Mayo (OR = 0.81, p = 7.0E-04) and ADGC (OR = 0.89, p = 1.26E-04) cohorts. The rs3785883-A allele was also nominally significantly associated with LOAD risk (OR = 1.06, 95% CI = 1.01-1.13, p = 0.034). Haplotype analysis revealed significant global association with LOAD risk in the combined cohort (p = 0.033), with significant association of the H2 haplotype with reduced risk of LOAD as expected (p = 1.53E-04) and suggestive association with additional haplotypes. MAPT SNPs and haplotypes were also associated with brain MAPT levels in the cerebellum and temporal cortex of AD subjects, with the strongest associations observed for the H2 haplotype and reduced brain MAPT levels (β = -0.16 to -0.20, p = 1.0E-03 to 3.0E-03). Conclusions: These results confirm the previously reported MAPT H2 associations with LOAD risk in two large series, show that this haplotype has the strongest effect on brain MAPT expression amongst those tested, and identify additional haplotypes with suggestive associations, which require replication in independent series. These biologically congruent results provide compelling evidence to screen the MAPT region for regulatory variants which confer LOAD risk by influencing its brain gene expression.

Crossref

Columbia University Academic Commons

Springer - Publisher Connector

PubMed Central

University of Miami: Scholarship Miami

How to discuss gene therapy for haemophilia? A patient and physician perspective

Gene therapy has the potential to revolutionise treatment for patients with haemophilia and is close to entering clinical practice. While factor concentrates have improved outcomes, individuals still face a lifetime of injections, pain, progressive joint damage, the potential for inhibitor development and impaired quality of life. Recently published studies in adeno‐associated viral (AAV) vector‐mediated gene therapy have demonstrated improvement in endogenous factor levels over sustained periods, significant reduction in annualised bleed rates, lower exogenous factor usage and thus far a positive safety profile. In making the shared decision to proceed with gene therapy for haemophilia, physicians should make it clear that research is ongoing and that there are remaining evidence gaps, such as long‐term safety profiles and duration of treatment effect. The eligibility criteria for gene therapy trials mean that key patient groups may be excluded, e.g. children/adolescents, those with liver or kidney dysfunction and those with a prior history of factor inhibitors or pre‐existing neutralising AAV antibodies. Gene therapy offers a life‐changing opportunity for patients to reduce their bleeding risk while also reducing or abrogating the need for exogenous factor administration. Given the expanding evidence base, both physicians and patients will need sources of clear and reliable information to be able to discuss and judge the risks and benefits of treatment.

Crossref

White Rose Research Online

Hochschulschriftenserver - Universität Frankfurt am Main

Bayesian modelling with skew-elliptical distributions

Author: Chai High Seng
Publication venue: 'University of Southampton'
Publication date: 01/01/2004

The dissertation is devoted to modelling with a new class of multivariate skew elliptical distributions. This family of distributions extends the elliptical ones by the addition of a vector of shape parameters. It contains the multivariate skew normal, skew Student’s t and skew Cauchy as special cases. Detailed exploration is confined to the case of the univariate skew normal distribution. In particular, salient properties of the density are studied and comparisons are drawn with alternative skew normal proposals. Applications considered include linear regression, variance components and survival models. Bayesian analysis with these models is shown to be easily accomplished through the use of the Gibbs sampler. The latter proves very straightforward to specify distributionally and to implement computationally. Numerical examples show that skew normal modelling is a viable competitor to the celebrated normal theory methods.

Southampton (e-Prints Soton)