174 research outputs found
A graph theoretic approach to testing associations between disparate sources of functional genomic data
The last few years have seen the advent of high-throughput technologies to analyze various properties of the transcriptome and proteome of several organisms. The congruency of these different data sources, or lack thereof, can shed light on the mechanisms that govern cellular function. A central challenge for bioinformatics research is to develop a unified framework for combining the multiple sources of functional genomics information and testing associations between them, thus obtaining a robust and integrated view of the underlying biology.
We present a graph theoretic approach to test the significance of the association between multiple disparate sources of functional genomics data by proposing two statistical tests, namely edge permutation and node label permutation tests. We demonstrate the use of the proposed tests by finding a significant association between a Gene Ontology-derived predictome and data obtained from mRNA expression and phenotypic experiments for Saccharomyces cerevisiae. Moreover, we employ the graph theoretic framework to recast a surprising discrepancy presented in Giaever et al. (2002) between gene expression and knockout phenotype, using expression data from a different set of experiments.
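As an illustration of the node label permutation idea named in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes both data sources are represented as undirected graphs over the same gene set, and it uses the number of shared edges as the test statistic; the abstract does not state which statistic the paper uses, so that choice and all function names here are assumptions.

```python
import random


def normalize(edges):
    """Represent undirected edges as frozensets so (u, v) equals (v, u)."""
    return {frozenset(e) for e in edges}


def node_label_permutation_test(nodes, edges_a, edges_b, n_perm=1000, seed=0):
    """Return (observed shared-edge count, permutation p-value).

    The shared-edge statistic is an assumed choice for illustration only.
    Self-loops are assumed absent.
    """
    rng = random.Random(seed)
    a = normalize(edges_a)
    b = normalize(edges_b)
    observed = len(a & b)
    nodes = list(nodes)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the node labels of graph B while keeping its topology fixed.
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        permuted_b = {frozenset(relabel[u] for u in e) for e in b}
        if len(a & permuted_b) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy input: two identical path graphs on eight nodes.
toy_nodes = range(8)
toy_edges = [(i, i + 1) for i in range(7)]
observed, p_value = node_label_permutation_test(toy_nodes, toy_edges, toy_edges)
```

An edge permutation test would follow the same pattern, but would randomize which node pairs carry edges rather than relabeling the nodes.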
Toward accurate high-throughput SNP genotyping in the presence of inherited copy number variation
Background: The recent discovery of widespread copy number variation in humans has forced a shift away from the assumption of two copies per locus per cell throughout the autosomal genome. In particular, a SNP site can no longer always be accurately assigned one of three genotypes in an individual. In the presence of copy number variability, the individual may theoretically harbor any number of copies of each of the two SNP alleles. Results: To address this issue, we have developed a method to infer a "generalized genotype" from raw SNP microarray data. Here we apply our approach to data from 48 individuals and uncover thousands of aberrant SNPs, most in regions that were previously unreported as copy number variants. We show that our allele-specific copy numbers follow Mendelian inheritance patterns that would be obscured in the absence of SNP allele information. The interplay between duplication and point mutation in our data sheds light on the relative frequencies of these events in human history, showing that at least some of the duplication events were recurrent. Conclusion: This new multi-allelic view of SNPs has a complicated role in disease association studies, and further work will be necessary in order to accurately assess its importance. Software to perform generalized genotyping from SNP array data is freely available online [1].
Allelic Selection of Amplicons in Glioblastoma Revealed by Combining Somatic and Germline Analysis
Cancer is a disease driven by a combination of inherited risk alleles coupled with the acquisition of somatic mutations, including amplification and deletion of genomic DNA. Potential relationships between the inherited and somatic aspects of the disease have only rarely been examined on a genome-wide level. Applying a novel integrative analysis of SNP and copy number measurements, we queried the tumor and normal-tissue genomes of 178 glioblastoma patients from the Cancer Genome Atlas project for preferentially amplified alleles, under the hypothesis that oncogenic germline variants will be selectively amplified in the tumor environment. Selected alleles are revealed by allelic imbalance in amplification across samples. This general approach is based on genetic principles and provides a method for identifying important tumor-related alleles. We find that SNP alleles that are most significantly overrepresented in amplicons tend to occur in genes involved with regulation of kinase and transferase activity, and many of these genes are known contributors to gliomagenesis. The analysis also implicates variants in synapse genes. By incorporating gene expression data, we demonstrate synergy between preferential allelic amplification and expression in DOCK4 and EGFR. Our results support the notion that combining germline and tumor genetic data can identify regions relevant to cancer biology.
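The notion of "allelic imbalance in amplification across samples" can be sketched with a simple sign-style test. This is an assumed simplification for illustration, not the integrative analysis used in the study: for one SNP, take the germline-heterozygous patients whose tumors amplify the locus, count how often each allele is the one with more copies, and ask whether the split departs from 50/50 using an exact two-sided binomial test. The input format and function names are hypothetical.

```python
from math import comb


def two_sided_binomial_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in probs if q <= cutoff))


def allelic_imbalance(tumor_copies):
    """tumor_copies: list of (copies_of_A, copies_of_B) pairs, one per
    germline-heterozygous sample with an amplification at this SNP.
    Returns (samples favoring A, samples favoring B, p-value)."""
    a_wins = sum(1 for a, b in tumor_copies if a > b)
    b_wins = sum(1 for a, b in tumor_copies if b > a)
    return a_wins, b_wins, two_sided_binomial_p(a_wins, a_wins + b_wins)


# Hypothetical data: allele A is the amplified allele in 9 of 10 tumors.
hypothetical_copies = [(4, 1)] * 9 + [(1, 3)]
a_wins, b_wins, p_value = allelic_imbalance(hypothetical_copies)
```

A genome-wide scan of this kind would also need multiple-testing correction, which is omitted here.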
An optimization framework for unsupervised identification of rare copy number variation from SNP array data
A highly sensitive and configurable method for calling copy number variants from SNP array data is presented that can identify even rare CNVs.
Cancer gene mutation discovery and detection using array-based resequencing
Allele-Specific Amplification in Cancer Revealed by SNP Array Analysis
Amplification, deletion, and loss of heterozygosity of genomic DNA are hallmarks of cancer. In recent years a variety of studies have emerged measuring total chromosomal copy number at increasingly high resolution. Similarly, loss-of-heterozygosity events have been finely mapped using high-throughput genotyping technologies. We have developed a probe-level allele-specific quantitation procedure that extracts both copy number and allelotype information from single nucleotide polymorphism (SNP) array data to arrive at allele-specific copy number across the genome. Our approach applies an expectation-maximization algorithm to a model derived from a novel classification of SNP array probes. This method is the first to our knowledge that is able to (a) determine the generalized genotype of aberrant samples at each SNP site (e.g., CCCCT at an amplified site), and (b) infer the copy number of each parental chromosome across the genome. With this method, we are able to determine not just where amplifications and deletions occur, but also the haplotype of the region being amplified or deleted. The merit of our model and general approach is demonstrated by very precise genotyping of normal samples, and our allele-specific copy number inferences are validated using PCR experiments. Applying our method to a collection of lung cancer samples, we are able to conclude that amplification is essentially monoallelic, as would be expected under the mechanisms currently believed responsible for gene amplification. This suggests that a specific parental chromosome may be targeted for amplification, whether because of germ line or somatic variation. An R software package containing the methods described in this paper is freely available at http://genome.dfci.harvard.edu/~tlaframb/PLASQ
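To make the "generalized genotype" notation concrete (e.g., CCCCT at an amplified site), the short sketch below enumerates every generalized genotype that is possible for a given total copy number at a biallelic SNP. It only illustrates the state space such a method chooses among; it is not the probe-level expectation-maximization procedure itself, and the function name is hypothetical.

```python
def generalized_genotypes(total_copies, alleles=("C", "T")):
    """List every allele combination for a biallelic SNP present in
    `total_copies` copies, written as a string such as 'CCCCT'."""
    first, second = alleles
    return [
        first * (total_copies - n_second) + second * n_second
        for n_second in range(total_copies + 1)
    ]


# A normal diploid site has three possible genotypes; a site amplified
# to five copies has six, including the abstract's example 'CCCCT'.
diploid = generalized_genotypes(2)
amplified = generalized_genotypes(5)
```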
Circulating microbial content in myeloid malignancy patients is associated with disease subtypes and patient outcomes
Although recent work has described the microbiome in solid tumors, microbial content in hematological malignancies is not well-characterized. Here we analyze existing deep DNA sequence data from the blood and bone marrow of 1870 patients with myeloid malignancies, along with healthy controls, for bacterial, fungal, and viral content. After strict quality filtering, we find evidence for dysbiosis in disease cases, and distinct microbial signatures among disease subtypes. We also find that microbial content is associated with host gene mutations and with myeloblast cell percentages. In patients with low-risk myelodysplastic syndrome, we provide evidence that Epstein-Barr virus status refines risk stratification into more precise categories than the current standard. Motivated by these observations, we construct machine-learning classifiers that can discriminate among disease subtypes based solely on bacterial content. Our study highlights the association between the circulating microbiome and patient outcome, and its relationship with disease subtype.
Human Female Genital Tract Infection by the Obligate Intracellular Bacterium Chlamydia trachomatis Elicits Robust Type 2 Immunity
While Chlamydia trachomatis infections are frequently asymptomatic, mechanisms that regulate host response to this intracellular Gram-negative bacterium remain undefined. This investigation thus used peripheral blood mononuclear cells and endometrial tissue from women with or without Chlamydia genital tract infection to better define this response. Initial genome-wide microarray analysis revealed highly elevated expression of matrix metalloproteinase 10 and other molecules characteristic of Type 2 immunity (e.g., fibrosis and wound repair) in Chlamydia-infected tissue. This result was corroborated in flow cytometry and immunohistochemistry studies that showed extant upper genital tract Chlamydia infection was associated with increased co-expression of CD200 receptor and CD206 (markers of alternative macrophage activation) by endometrial macrophages as well as increased expression of GATA-3 (the transcription factor regulating TH2 differentiation) by endometrial CD4+ T cells. Also among women with genital tract Chlamydia infection, peripheral CD3+ CD4+ and CD3+ CD4- cells that proliferated in response to ex vivo stimulation with inactivated chlamydial antigen secreted significantly more interleukin (IL)-4 than tumor necrosis factor, interferon-γ, or IL-17; these findings were repeated in T cells isolated from these same women 1 and 4 months after infection had been eradicated. Our results thus newly reveal that genital infection by an obligate intracellular bacterium induces polarization towards Type 2 immunity, including Chlamydia-specific TH2 development. Based on these findings, we now speculate that Type 2 immunity was selected by evolution as the host response to C. trachomatis in the human female genital tract to control infection and minimize immunopathological damage to vital reproductive structures. © 2013 Vicetti Miguel et al.