717 research outputs found
HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants
The resolution of genome-wide association studies (GWAS) is limited by the linkage disequilibrium (LD) structure of the population being studied. Selecting the most likely causal variants within an LD block is relatively straightforward within coding sequence, but is more difficult when all variants are intergenic. Predicting functional non-coding sequence has been recently facilitated by the availability of conservation and epigenomic information. We present HaploReg, a tool for exploring annotations of the non-coding genome among the results of published GWAS or novel sets of variants. Using LD information from the 1000 Genomes Project, linked SNPs and small indels can be visualized along with their predicted chromatin state in nine cell types, conservation across mammals and their effect on regulatory motifs. Sets of SNPs, such as those resulting from GWAS, are analyzed for an enrichment of cell type-specific enhancers. HaploReg will be useful to researchers developing mechanistic hypotheses of the impact of non-coding variants on clinical phenotypes and normal variation. The HaploReg database is available at http://compbio.mit.edu/HaploReg. National Institutes of Health (U.S.) (R01-HG004037); National Institutes of Health (U.S.) (RC1-HG005334); National Science Foundation (U.S.) (HG005334)
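As a rough illustration of what "linked" means in the abstract above, the sketch below computes the standard r² measure of LD between two biallelic variants from phased haplotypes and keeps the variants that exceed a threshold. The haplotype data, the variant names and the 0.8 threshold are all hypothetical; this is not HaploReg's code and makes no claim about how HaploReg derives its LD blocks.

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic variants, given 0/1 alleles per haplotype."""
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at variant A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at variant B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n  # both alleles on one haplotype
    d = p_ab - p_a * p_b                      # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom else 0.0    # monomorphic sites carry no LD signal


# Hypothetical phased haplotypes (8 chromosomes) for a lead SNP and two neighbours.
lead = [0, 0, 1, 1, 0, 1, 1, 0]
neighbours = {
    "snp_x": [0, 0, 1, 1, 0, 1, 1, 0],  # always inherited together with the lead SNP
    "snp_y": [0, 1, 0, 1, 0, 1, 0, 1],  # statistically independent of the lead SNP
}

R2_THRESHOLD = 0.8  # assumed cut-off, chosen for illustration only
linked = [name for name, hap in neighbours.items()
          if r_squared(lead, hap) >= R2_THRESHOLD]
```

Every variant in `linked` is statistically indistinguishable from the lead SNP in an association test, which is why the abstract turns to chromatin state, conservation and motif annotations to prioritize among them.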
Dietary soy and meat proteins induce distinct physiological and gene expression changes in rats
This study reports on a comprehensive comparison of the effects of soy and meat proteins given at the recommended level on physiological markers of metabolic syndrome and the hepatic transcriptome. Male rats were fed semi-synthetic diets for 1 wk that differed only regarding protein source, with casein serving as reference. Body weight gain and adipose tissue mass were significantly reduced by soy but not meat proteins. The insulin resistance index was improved by soy, and to a lesser extent by meat proteins. Liver triacylglycerol contents were reduced by both protein sources, which coincided with increased plasma triacylglycerol concentrations. Both soy and meat proteins changed plasma amino acid patterns. The expression of 1571 and 1369 genes was altered by soy and meat proteins, respectively. Functional classification revealed that lipid, energy and amino acid metabolic pathways, as well as insulin signaling pathways, were regulated differently by soy and meat proteins. Several transcriptional regulators, including NFE2L2, ATF4, Srebf1 and Rictor, were identified as potential key upstream regulators. These results suggest that soy and meat proteins induce distinct physiological and gene expression responses in rats and provide novel evidence and suggestions for the health effects of different protein sources in human diets.
A Model-Based Analysis of GC-Biased Gene Conversion in the Human and Chimpanzee Genomes
GC-biased gene conversion (gBGC) is a recombination-associated process that favors the fixation of G/C alleles over A/T alleles. In mammals, gBGC is hypothesized to contribute to variation in GC content, rapidly evolving sequences, and the fixation of deleterious mutations, but its prevalence and general functional consequences remain poorly understood. gBGC is difficult to incorporate into models of molecular evolution and so far has primarily been studied using summary statistics from genomic comparisons. Here, we introduce a new probabilistic model that captures the joint effects of natural selection and gBGC on nucleotide substitution patterns, while allowing for correlations along the genome in these effects. We implemented our model in a computer program, called phastBias, that can accurately detect gBGC tracts about 1 kilobase or longer in simulated sequence alignments. When applied to real primate genome sequences, phastBias predicts gBGC tracts that cover roughly 0.3% of the human and chimpanzee genomes and account for 1.2% of human-chimpanzee nucleotide differences. These tracts fall in clusters, particularly in subtelomeric regions; they are enriched for recombination hotspots and fast-evolving sequences; and they display an ongoing fixation preference for G and C alleles. They are also significantly enriched for disease-associated polymorphisms, suggesting that they contribute to the fixation of deleterious alleles. The gBGC tracts provide a unique window into historical recombination processes along the human and chimpanzee lineages. They supply additional evidence of long-term conservation of megabase-scale recombination rates accompanied by rapid turnover of hotspots. Together, these findings shed new light on the evolutionary, functional, and disease implications of gBGC. The phastBias program and our predicted tracts are freely available. © 2013 Capra et al
Characteristics of transposable element exonization within human and mouse
Insertion of transposed elements within mammalian genes is thought to be an
important contributor to mammalian evolution and speciation. Insertion of
transposed elements into introns can lead to their activation as alternatively
spliced cassette exons, an event called exonization. Elucidation of the
evolutionary constraints that have shaped fixation of transposed elements
within human and mouse protein coding genes and subsequent exonization is
important for understanding of how the exonization process has affected
transcriptome and proteome complexities. Here we show that exonization of
transposed elements is biased towards the beginning of the coding sequence in
both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs)
revealed that exonization of transposed elements can be population-specific,
implying that exonizations may enhance divergence and lead to speciation. SNP
density analysis revealed differences between Alu and other transposed
elements. Finally, we identified cases of primate-specific Alu elements that
depend on RNA editing for their exonization. These results shed light on TE
fixation and the exonization process within human and mouse genes.
Comment: 11 pages, 4 figures
Different genes interact with particulate matter and tobacco smoke exposure in affecting lung function decline in the general population
BACKGROUND: Oxidative stress related genes modify the effects of ambient air pollution or tobacco smoking on lung function decline. The impact of interactions might be substantial, but previous studies mostly focused on main effects of single genes. OBJECTIVES: We studied the interaction of both exposures with a broad set of oxidative-stress related candidate genes and pathways on lung function decline and contrasted interactions between exposures. METHODS: For 12679 single nucleotide polymorphisms (SNPs), change in forced expiratory volume in one second (FEV1), FEV1 over forced vital capacity (FEV1/FVC), and mean forced expiratory flow between 25 and 75% of the FVC (FEF25-75) was regressed on interval exposure to particulate matter <10 µm in diameter (PM10) or packyears smoked (a), additive SNP effects (b), and interaction terms between (a) and (b) in 669 adults with GWAS data. Interaction p-values for 152 genes and 14 pathways were calculated by the adaptive rank truncation product (ARTP) method, and compared between exposures. Interaction effect sizes were contrasted for the strongest SNPs of nominally significant genes (p(interaction)<0.05). Replication was attempted for SNPs with MAF>10% in 3320 SAPALDIA participants without GWAS. RESULTS: On the SNP-level, rs2035268 in gene SNCA accelerated FEV1/FVC decline by 3.8% (p(interaction) = 2.5x10^-6), and rs12190800 in PARK2 attenuated FEV1 decline by 95.1 ml (p(interaction) = 9.7x10^-8) over 11 years, while interacting with PM10. Genes and pathways nominally interacting with PM10 and packyears exposure differed substantially. Gene CRISP2 presented a significant interaction with PM10 (p(interaction) = 3.0x10^-4) on FEV1/FVC decline. Pathway interactions were weak. Replications for the strongest SNPs in PARK2 and CRISP2 were not successful.
CONCLUSIONS: Consistent with a stratified response to increasing oxidative stress, different genes and pathways potentially mediate PM10 and tobacco smoke effects on lung function decline. Ignoring environmental exposures would miss these patterns, but achieving sufficient sample size and comparability across study samples is challenging.
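The regression described in the METHODS above has the general form: outcome = b0 + b1·exposure + b2·SNP dosage + b3·(exposure × dosage), where b3 is the interaction effect. The sketch below fits that form by ordinary least squares on noise-free toy data. The data, the coefficient values and the helper names (`fit_ols`, `design_row`) are hypothetical, and the study's covariate adjustment and ARTP gene-level testing are deliberately left out; this is a minimal illustration, not the authors' analysis.

```python
def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def design_row(exposure, dosage):
    """Intercept, exposure (a), additive SNP dosage (b), and interaction a*b."""
    return [1.0, exposure, dosage, exposure * dosage]


# Hypothetical toy data: exposure levels crossed with 0/1/2 copies of an allele.
exposures = [0, 1, 2, 3] * 3
dosages = [0] * 4 + [1] * 4 + [2] * 4
# Simulated lung function change with a known interaction of -4 per unit exposure per allele.
decline = [-30 - 5 * e + 2 * g - 4 * e * g for e, g in zip(exposures, dosages)]

X = [design_row(e, g) for e, g in zip(exposures, dosages)]
beta = fit_ols(X, decline)  # [intercept, exposure, SNP, interaction]
```

A non-zero `beta[3]` means the effect of exposure on the outcome depends on genotype, which is the quantity the interaction p-values in the abstract refer to.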
Control of intestinal stem cell function and proliferation by mitochondrial pyruvate metabolism.
Most differentiated cells convert glucose to pyruvate in the cytosol through glycolysis, followed by pyruvate oxidation in the mitochondria. These processes are linked by the mitochondrial pyruvate carrier (MPC), which is required for efficient mitochondrial pyruvate uptake. In contrast, proliferative cells, including many cancer and stem cells, perform glycolysis robustly but limit fractional mitochondrial pyruvate oxidation. We sought to understand the role this transition from glycolysis to pyruvate oxidation plays in stem cell maintenance and differentiation. Loss of the MPC in Lgr5-EGFP-positive stem cells, or treatment of intestinal organoids with an MPC inhibitor, increases proliferation and expands the stem cell compartment. Similarly, genetic deletion of the MPC in Drosophila intestinal stem cells also increases proliferation, whereas MPC overexpression suppresses stem cell proliferation. These data demonstrate that limiting mitochondrial pyruvate metabolism is necessary and sufficient to maintain the proliferation of intestinal stem cells
The UCSC Genome Browser Database: 2008 update
The University of California, Santa Cruz, Genome Browser Database (GBD) provides integrated sequence and annotation data for a large collection of vertebrate and model organism genomes. Seventeen new assemblies have been added to the database in the past year, for a total coverage of 19 vertebrate and 21 invertebrate species as of September 2007. For each assembly, the GBD contains a collection of annotation data aligned to the genomic sequence. Highlights of this year's additions include a 28-species human-based vertebrate conservation annotation, an enhanced UCSC Genes set, and more human variation, MGC, and ENCODE data. The database is optimized for fast interactive performance with a set of web-based tools that may be used to view, manipulate, filter and download the annotation data. New toolset features include the Genome Graphs tool for displaying genome-wide data sets, session saving and sharing, better custom track management, expanded Genome Browser configuration options and a Genome Browser wiki site. The downloadable GBD data, the companion Genome Browser toolset and links to documentation and related information can be found at: http://genome.ucsc.ed
Molecular, phenotypic, and sample-associated data to describe pluripotent stem cell lines and derivatives
The use of induced pluripotent stem cells (iPSC) derived from independent patients and sources holds considerable promise to improve the understanding of development and disease. However, optimized use of iPSC depends on our ability to develop methods to efficiently qualify cell lines and protocols, monitor genetic stability, and evaluate self-renewal and differentiation potential. To accomplish these goals, 57 stem cell lines from 10 laboratories were differentiated to 7 different states, resulting in 248 analyzed samples. Cell lines were differentiated and characterized at a central laboratory using standardized cell culture methodologies, protocols, and metadata descriptors. Stem cell and derived differentiated lines were characterized using RNA-seq, miRNA-seq, copy number arrays, DNA methylation arrays, flow cytometry, and molecular histology. All materials, including raw data, metadata, analysis and processing code, and methodological and provenance documentation are publicly available for re-use and interactive exploration at https://www.synapse.org/pcbc. The goal is to provide data that can improve our ability to robustly and reproducibly use human pluripotent stem cells to understand development and disease
Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data
Background: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein coding genes. miRNAs play a very important role in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcription regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein coding genes) are co-transcribed with their host genes and that most intergenic miRNAs are transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and promoter organization are currently unknown. Methodology: We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion: miRNA promoters have similar features to those of protein coding genes, but miRNA transcript organization is more complex. © 2009 Corcoran et al.
Predicting cell types and genetic variations contributing to disease by combining GWAS and epigenetic data
Genome-wide association studies (GWASs) identify single nucleotide polymorphisms (SNPs) that are enriched in individuals suffering from a given disease. Most disease-associated SNPs fall into non-coding regions, so that it is not straightforward to infer phenotype or function; moreover, many SNPs are in tight genetic linkage, so that a SNP identified as associated with a particular disease may not itself be causal, but rather signify the presence of a linked SNP that is functionally relevant to disease pathogenesis. Here, we present an analysis method that takes advantage of the recent rapid accumulation of epigenomics data to address these problems for some SNPs. Using asthma as a prototypic example, we show that non-coding disease-associated SNPs are enriched in genomic regions that function as regulators of transcription, such as enhancers and promoters. Identifying enhancers based on the presence of histone modification marks such as H3K4me1 in different cell types, we show that the location of enhancers is highly cell-type specific. We use these findings to predict which SNPs are likely to be directly contributing to disease based on their presence in regulatory regions, and in which cell types their effect is expected to be detectable. Moreover, we can also predict which cell types contribute to a disease based on overlap of the disease-associated SNPs with the locations of enhancers present in a given cell type. Finally, we suggest that it will be possible to re-analyze GWASs with much higher power by limiting the SNPs considered to those in coding or regulatory regions of cell types relevant to a given disease.
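The overlap-based reasoning in the abstract above can be sketched as a simple enrichment test: count how many disease-associated SNPs fall inside the enhancers of one cell type, then ask how surprising that count is given how many background SNPs do. The sketch uses a one-sided hypergeometric test as an assumed stand-in; the abstract does not say which statistic the authors used, and the coordinates, SNP sets and function names are hypothetical.

```python
from math import comb


def in_any_interval(pos, intervals):
    """True if a position lies inside any half-open [start, end) interval."""
    return any(start <= pos < end for start, end in intervals)


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are 'successes'."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


# Hypothetical enhancer intervals for one cell type on a single toy chromosome.
enhancers = [(10, 20), (50, 60)]
# Hypothetical background: every genotyped SNP position.
background_snps = list(range(100))
# Hypothetical disease-associated SNPs (a subset of the background).
disease_snps = [12, 15, 18, 55, 70]

background_hits = sum(in_any_interval(p, enhancers) for p in background_snps)
disease_hits = sum(in_any_interval(p, enhancers) for p in disease_snps)

# Probability of seeing at least `disease_hits` enhancer SNPs by chance.
p_value = hypergeom_sf(disease_hits, len(background_snps),
                       background_hits, len(disease_snps))
```

Repeating the test with enhancer sets from different cell types gives one p-value per cell type, and the cell types with the smallest values are the candidates for contributing to the disease.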
- …
