113 research outputs found

    A new skew-elliptical distribution and its properties

    This article generalizes a multivariate skew-elliptical distribution and describes its many interesting properties. The univariate version of the new distribution is compared with two other currently used distributions. The use of the new distribution is illustrated with a real data example suitable for regression modelling. The new model provides a better model fit than its two rivals as evaluated by some suitable Bayesian model selection criteria.

    Spatial normalization improves the quality of genotype calling for Affymetrix SNP 6.0 arrays

    Background: Microarray measurements are susceptible to a variety of experimental artifacts, some of which give rise to systematic biases that are spatially dependent in a unique way on each chip. It is likely that such artifacts affect many SNP arrays, but the normalization methods used in currently available genotyping algorithms make no attempt at spatial bias correction. Here, we propose an effective single-chip spatial bias removal procedure for Affymetrix 6.0 SNP arrays or platforms with similar design features. This procedure deals with both extreme and subtle biases and is intended to be applied before standard genotype calling algorithms. Results: Application of the spatial bias adjustments on HapMap samples resulted in higher genotype call rates with equal or even better accuracy for thousands of SNPs. Consequently the normalization procedure is expected to lead to more meaningful biological inferences and could be valuable for genome-wide SNP analysis. Conclusions: Spatial normalization can potentially rescue thousands of SNPs in a genetic study at the small cost of computational time. The approach is implemented in R and available from the authors upon request.

    GLOSSI: a method to assess the association of genetic loci-sets with complex diseases

    Background: The development of high-throughput genotyping technologies, which enable the simultaneous genotyping of hundreds of thousands of single nucleotide polymorphisms (SNPs), has the potential to increase the benefits of genetic epidemiology studies. Although the enhanced resolution of these platforms increases the chance of interrogating functional SNPs that are themselves causative or in linkage disequilibrium with causal SNPs, commonly used single SNP-association approaches suffer from serious multiple hypothesis testing problems and provide limited insights into combinations of loci that may contribute to complex diseases. Drawing inspiration from Gene Set Enrichment Analysis developed for gene expression data, we have developed a method, named GLOSSI (Gene-loci Set Analysis), that integrates prior biological knowledge into the statistical analysis of genotyping data to test the association of a group of SNPs (loci-set) with complex disease phenotypes. The most significant loci-sets can be used to formulate hypotheses from a functional viewpoint that can be validated experimentally. Results: In a simulation study, GLOSSI showed sufficient power to detect loci-sets with less than 10% of SNPs having moderate-to-large effect sizes and intermediate minor allele frequency values. When applied to a biological dataset where no single SNP-association was found in a previous study, GLOSSI was able to identify several loci-sets that are significantly related to blood pressure response to an antihypertensive drug. Conclusion: GLOSSI is valuable for association of SNPs at multiple genetic loci with complex disease phenotypes. In contrast to methods based on the Kolmogorov-Smirnov statistic, the approach is parametric and only utilizes information from within the interrogated loci-set. It properly accounts for dependency among SNPs and allows the testing of loci-sets of any size.

    TREAT: a bioinformatics tool for variant annotations and visualizations in targeted and exome sequencing data

    Summary: TREAT (Targeted RE-sequencing Annotation Tool) is a tool for facile navigation and mining of the variants from both targeted resequencing and whole exome sequencing. It provides a rich integration of publicly available as well as in-house developed annotations and visualizations for variants, variant-hosting genes and host-gene pathways.

    A novel bioinformatics pipeline for identification and characterization of fusion transcripts in breast cancer and normal cell lines

    SnowShoes-FTD, developed for fusion transcript detection in paired-end mRNA-Seq data, employs multiple steps of false positive filtering to nominate fusion transcripts with near 100% confidence. Unique features include: (i) identification of multiple fusion isoforms from two gene partners; (ii) prediction of genomic rearrangements; (iii) identification of exon fusion boundaries; (iv) generation of a 5′–3′ fusion spanning sequence for PCR validation; and (v) prediction of the protein sequences, including frame shift and amino acid insertions. We applied SnowShoes-FTD to identify 50 fusion candidates in 22 breast cancer and 9 non-transformed cell lines. Five additional fusion candidates with two isoforms were confirmed. In all, 30 of 55 fusion candidates had in-frame protein products. No fusion transcripts were detected in non-transformed cells. Consideration of the possible functions of a subset of predicted fusion proteins suggests several potentially important functions in transformation, including a possible new mechanism for overexpression of ERBB2 in a HER-positive cell line. The source code of SnowShoes-FTD is provided in two formats: one configured to run on the Sun Grid Engine for parallelization, and the other formatted to run on a single LINUX node. Executables in PERL are available for download from our web site: http://mayoresearch.mayo.edu/mayo/research/biostat/stand-alone-packages.cfm

    Batch effect correction for genome-wide methylation data with Illumina Infinium platform

    Background: Genome-wide methylation profiling has led to more comprehensive insights into gene regulation mechanisms and potential therapeutic targets. Illumina Human Methylation BeadChip is one of the most commonly used genome-wide methylation platforms. Similar to other microarray experiments, methylation data is susceptible to various technical artifacts, particularly batch effects. To date, little attention has been given to issues related to normalization and batch effect correction for this kind of data. Methods: We evaluated three common normalization approaches and investigated their performance in batch effect removal using three datasets with different degrees of batch effects generated from the HumanMethylation27 platform: quantile normalization at average β value (QNβ); two-step quantile normalization at probe signals implemented in the "lumi" package of R (lumi); and quantile normalization of A and B signals separately (ABnorm). Subsequent Empirical Bayes (EB) batch adjustment was also evaluated. Results: Each normalization could remove a portion of batch effects and their effectiveness differed depending on the severity of batch effects in a dataset. For the dataset with minor batch effects (Dataset 1), normalization alone appeared adequate and "lumi" showed the best performance. However, all methods left substantial batch effects intact in the datasets with obvious batch effects and further correction was necessary. Without any correction, 50 and 66 percent of CpGs were associated with batch effects in Datasets 2 and 3, respectively. After QNβ, lumi or ABnorm, the proportion of CpGs associated with batch effects was reduced to 24, 32, and 26 percent for Dataset 2; and 37, 46, and 35 percent for Dataset 3, respectively. Additional EB correction effectively removed such remaining non-biological effects. More importantly, the two-step procedure almost tripled the numbers of CpGs associated with the outcome of interest for the two datasets. Conclusion: Genome-wide methylation data from Infinium Methylation BeadChip can be susceptible to batch effects with profound impacts on downstream analyses and conclusions. Normalization can reduce some but not all batch effects. EB correction along with normalization is recommended for effective batch effect removal.
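
    The quantile normalization step named in the abstract above can be illustrated with a minimal sketch. It assumes a plain probes-by-samples matrix and forces every sample (column) onto a shared reference distribution, the mean of the sorted columns. It is not the QNβ, lumi, or ABnorm procedure evaluated in the study, and the function name `quantile_normalize` and the toy values are hypothetical.

```python
def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a probes x samples matrix.

    Illustrative sketch only: ties are broken by original row order rather
    than averaged, which dedicated packages may handle differently.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Split the matrix into per-sample columns.
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_columns = [sorted(col) for col in columns]
    # Reference distribution: mean across samples of the value at each rank.
    reference = [
        sum(col[i] for col in sorted_columns) / n_cols for i in range(n_rows)
    ]
    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        # Row indices of this sample ordered from smallest to largest value.
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = reference[rank]
    return normalized


# Toy example: three probes measured in two samples (made-up values).
beta = [[0.10, 0.30], [0.50, 0.20], [0.90, 0.80]]
for row in quantile_normalize(beta):
    print(row)
```

    After this step every sample has the same set of values, so differences between samples in overall distribution are removed while the within-sample ordering of probes is kept.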

    Association of MAPT haplotypes with Alzheimer’s disease risk and MAPT brain gene expression levels

    Introduction: MAPT encodes for tau, the predominant component of neurofibrillary tangles that are neuropathological hallmarks of Alzheimer’s disease (AD). Genetic association of MAPT variants with late-onset AD (LOAD) risk has been inconsistent, although insufficient power and incomplete assessment of MAPT haplotypes may account for this. Methods: We examined the association of MAPT haplotypes with LOAD risk in more than 20,000 subjects (n-cases = 9,814, n-controls = 11,550) from Mayo Clinic (n-cases = 2,052, n-controls = 3,406) and the Alzheimer’s Disease Genetics Consortium (ADGC, n-cases = 7,762, n-controls = 8,144). We also assessed associations with brain MAPT gene expression levels measured in the cerebellum (n = 197) and temporal cortex (n = 202) of LOAD subjects. Six single nucleotide polymorphisms (SNPs) which tag MAPT haplotypes with frequencies greater than 1% were evaluated. Results: H2-haplotype tagging rs8070723-G allele associated with reduced risk of LOAD (odds ratio, OR = 0.90, 95% confidence interval, CI = 0.85-0.95, p = 5.2E-05) with consistent results in the Mayo (OR = 0.81, p = 7.0E-04) and ADGC (OR = 0.89, p = 1.26E-04) cohorts. rs3785883-A allele was also nominally significantly associated with LOAD risk (OR = 1.06, 95% CI = 1.01-1.13, p = 0.034). Haplotype analysis revealed significant global association with LOAD risk in the combined cohort (p = 0.033), with significant association of the H2 haplotype with reduced risk of LOAD as expected (p = 1.53E-04) and suggestive association with additional haplotypes. MAPT SNPs and haplotypes also associated with brain MAPT levels in the cerebellum and temporal cortex of AD subjects with the strongest associations observed for the H2 haplotype and reduced brain MAPT levels (β = -0.16 to -0.20, p = 1.0E-03 to 3.0E-03). 
Conclusions: These results confirm the previously reported MAPT H2 associations with LOAD risk in two large series, show that this haplotype has the strongest effect on brain MAPT expression among those tested, and identify additional haplotypes with suggestive associations, which require replication in independent series. These biologically congruent results provide compelling evidence to screen the MAPT region for regulatory variants which confer LOAD risk by influencing its brain gene expression.

    How to discuss gene therapy for haemophilia? A patient and physician perspective

    Gene therapy has the potential to revolutionise treatment for patients with haemophilia and is close to entering clinical practice. While factor concentrates have improved outcomes, individuals still face a lifetime of injections, pain, progressive joint damage, the potential for inhibitor development and impaired quality of life. Recently published studies in adeno‐associated viral (AAV) vector‐mediated gene therapy have demonstrated improvement in endogenous factor levels over sustained periods, significant reduction in annualised bleed rates, lower exogenous factor usage and thus far a positive safety profile. In making the shared decision to proceed with gene therapy for haemophilia, physicians should make it clear that research is ongoing and that there are remaining evidence gaps, such as long‐term safety profiles and duration of treatment effect. The eligibility criteria for gene therapy trials mean that key patient groups may be excluded, e.g. children/adolescents, those with liver or kidney dysfunction and those with a prior history of factor inhibitors or pre‐existing neutralising AAV antibodies. Gene therapy offers a life‐changing opportunity for patients to reduce their bleeding risk while also reducing or abrogating the need for exogenous factor administration. Given the expanding evidence base, both physicians and patients will need sources of clear and reliable information to be able to discuss and judge the risks and benefits of treatment.

    Bayesian modelling with skew-elliptical distributions

    The dissertation is devoted to modelling with a new class of multivariate skew elliptical distributions. This family of distributions extends the elliptical ones by the addition of a vector of shape parameters. It contains the multivariate skew normal, skew Student’s t and skew Cauchy as special cases. Detailed exploration is confined to the case of the univariate skew normal distribution. In particular, salient properties of the density are studied and comparisons are drawn with alternative skew normal proposals. Applications considered include linear regression, variance components and survival models. Bayesian analysis with these models is shown to be easily accomplished through the use of the Gibbs sampler. The latter proves very straightforward to specify distributionally and to implement computationally. Numerical examples show that skew normal modelling is a viable competitor to the celebrated normal theory methods.