33 research outputs found

    BS Seeker: precise mapping for bisulfite sequencing

    Background: Bisulfite sequencing using next generation sequencers yields genome-wide measurements of DNA methylation at single nucleotide resolution. Traditional aligners are not designed for mapping bisulfite-treated reads, where the unmethylated Cs are converted to Ts. We have developed BS Seeker, an approach that converts the genome to a three-letter alphabet and uses Bowtie to align bisulfite-treated reads to a reference genome. It uses sequence tags to reduce mapping ambiguity. Post-processing of the alignments removes non-unique and low-quality mappings. Results: We tested our aligner on synthetic data, a bisulfite-converted Arabidopsis library, and human libraries generated from two different experimental protocols. We evaluated the performance of our approach and compared it to other bisulfite aligners. The results demonstrate that among the aligners tested, BS Seeker is more versatile and faster. When mapping to the human genome, BS Seeker generates alignments significantly faster than RMAP and BSMAP. Furthermore, BS Seeker is the only alignment tool that can explicitly account for tags which are generated by certain library construction protocols. Conclusions: BS Seeker provides fast and accurate mapping of bisulfite-converted reads. It can work with BS reads generated from the two different experimental protocols, and is able to efficiently map reads to large mammalian genomes. The Python program is freely available at http://pellegrini.mcdb.ucla.edu/BS_Seeker/BS_Seeker.html.
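
The three-letter idea in this abstract can be illustrated with a toy example. The sketch below is not BS Seeker's code: the function names are hypothetical, a plain exact string search stands in for Bowtie, and it assumes the read is collapsed to the same three-letter alphabet as the reference before matching. Only the forward strand is handled.

```python
# Toy illustration of three-letter bisulfite mapping (forward strand only).
# All names are hypothetical; str.find stands in for a real aligner.

def c_to_t(seq):
    """Collapse every C into T, giving a three-letter (A, G, T) sequence."""
    return seq.upper().replace("C", "T")


def find_unique(ref, read):
    """Return the single offset where the converted read matches the
    converted reference, or None if it matches nowhere or more than once."""
    conv_ref, conv_read = c_to_t(ref), c_to_t(read)
    hits = []
    start = conv_ref.find(conv_read)
    while start != -1:
        hits.append(start)
        start = conv_ref.find(conv_read, start + 1)
    return hits[0] if len(hits) == 1 else None


def call_methylation(ref, read, offset):
    """Compare the original read with the original reference at each
    reference C: a C in the read means protected (methylated), a T means
    converted (unmethylated). Returns (reference position, is_methylated)."""
    calls = []
    for i, base in enumerate(read.upper()):
        if ref[offset + i].upper() == "C" and base in "CT":
            calls.append((offset + i, base == "C"))
    return calls


ref = "AACGTTCGGA"
read = "ATGTTCGG"  # covers ref[1:9]; the first C was converted to T
offset = find_unique(ref, read)
calls = call_methylation(ref, read, offset)
```

Discarding reads that match more than once in the reduced alphabet mirrors, in a very simplified way, the removal of non-unique mappings mentioned in the abstract.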

    Detecting coordinated regulation of multi-protein complexes using logic analysis of gene expression

    Background: Many of the functional units in cells are multi-protein complexes such as RNA polymerase, the ribosome, and the proteasome. For such units to work together, one might expect a high level of regulation to enable co-appearance or repression of sets of complexes at the required time. However, this type of coordinated regulation between whole complexes is difficult to detect by existing methods for analyzing mRNA co-expression. We propose a new methodology that is able to detect such higher order relationships. Results: We detect coordinated regulation of multiple protein complexes using logic analysis of gene expression data. Specifically, we identify gene triplets composed of genes whose expression profiles are found to be related by various types of logic functions. In order to focus on complexes, we associate the members of a gene triplet with the distinct protein complexes to which they belong. In this way, we identify complexes related by specific kinds of regulatory relationships. For example, we may find that the transcription of complex C is increased only if the transcription of both complex A AND complex B is repressed. We identify hundreds of examples of coordinated regulation among complexes under various stress conditions. Many of these examples involve the ribosome. Some of our examples have been previously identified in the literature, while others are novel. One notable example is the relationship between the transcription of the ribosome, RNA polymerase and mannosyltransferase II, which is involved in N-linked glycan processing in the Golgi. Conclusions: The analysis proposed here focuses on relationships among triplets of genes that are not evident when genes are examined in a pairwise fashion as in typical clustering methods. By grouping gene triplets, we are able to decipher coordinated regulation among sets of three complexes. Moreover, using all triplets that involve coordinated regulation with the ribosome, we derive a large network involving this essential cellular complex. In this network we find that all multi-protein complexes that belong to the same functional class are regulated in the same direction as a group (either induced or repressed).
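
To make the triplet relationship concrete, here is a deliberately simplified sketch. It is not the authors' scoring method, which the abstract does not specify. It assumes expression values are binarized at a hypothetical threshold and scores each candidate logic function by the fraction of conditions in which it reproduces the third gene's state. All names and the toy data are illustrative.

```python
# Simplified illustration of a gene-triplet logic relationship.
# The threshold, the scoring rule, and the data are assumptions.

LOGIC_FUNCTIONS = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
    "NOR": lambda a, b: not (a or b),  # C is on only if A and B are both off
}


def binarize(profile, threshold=0.0):
    """Mark a gene as 'on' in each condition where it exceeds the threshold."""
    return [value > threshold for value in profile]


def agreement(predicted, observed):
    """Fraction of conditions in which the two binary profiles agree."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def best_logic(a, b, c):
    """Return the logic function of (a, b) that best reproduces c."""
    scores = {
        name: agreement([f(x, y) for x, y in zip(a, b)], c)
        for name, f in LOGIC_FUNCTIONS.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy log-ratio profiles over four conditions.
gene_a = binarize([1.2, 0.8, -0.5, -1.1])
gene_b = binarize([0.9, -0.7, 1.1, -0.4])
gene_c = binarize([-1.0, -0.6, -0.8, 1.3])
name, score = best_logic(gene_a, gene_b, gene_c)
```

In this toy data, gene C is on only when both A and B are off, which corresponds to the abstract's example of complex C being induced only if both A and B are repressed. Neither A nor B alone matches C as well as the combined function does, which is the sense in which such relationships are missed by pairwise comparison.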

    Modeling the regulatory network of histone acetylation in Saccharomyces cerevisiae

    Acetylation of histones plays an important role in regulating transcription. Histone acetylation is mediated partly by the recruitment of specific histone acetyltransferases (HATs) and deacetylases (HDACs) to genomic loci by transcription factors, resulting in modulation of gene expression. Although several specific interactions between transcription factors and HATs and HDACs have been elaborated in Saccharomyces cerevisiae, the full regulatory network remains uncharacterized. We have utilized a linear regression of optimized sigmoidal functions to correlate transcription factor binding patterns to the acetylation profiles of 11 lysines in the four core histones measured at all S. cerevisiae promoters. The resulting associations are combined with large-scale protein–protein interaction data sets to generate a comprehensive model that relates recruitment of specific HDACs and HATs to transcription factors and their target genes and the resulting effects on individual lysines. This model provides a broad and detailed view of the regulatory network, describing which transcription factors are most significant in regulating acetylation of specific lysines at defined promoters. We validate the model, both computationally and experimentally, to demonstrate that it yields accurate predictions of these regulatory mechanisms.

    Tissue-Specific Transcriptomes Reveal Gene Expression Trajectories in Two Maturing Skin Epithelial Layers in Zebrafish Embryos.

    Epithelial cells are the building blocks of many organs, including skin. The vertebrate skin initially consists of two epithelial layers, the outer periderm and inner basal cell layers, which have distinct properties, functions, and fates. The embryonic periderm ultimately disappears during development, whereas basal cells proliferate to form the mature, stratified epidermis. Although much is known about mechanisms of homeostasis in mature skin, relatively little is known about the two cell types in pre-stratification skin. To define the similarities and distinctions between periderm and basal skin epithelial cells, we purified them from zebrafish at early development stages and deeply profiled their gene expression. These analyses identified groups of genes whose tissue enrichment changed at each stage, defining gene flow dynamics of maturing vertebrate epithelia. At each of 52 and 72 hr post-fertilization (hpf), more than 60% of genes enriched in skin cells were similarly expressed in both layers, indicating that they were common epithelial genes, but many others were enriched in one layer or the other. Both expected and novel genes were enriched in periderm and basal cell layers. Genes encoding extracellular matrix, junctional, cytoskeletal, and signaling proteins were prominent among those distinguishing the two epithelial cell types. In situ hybridization and BAC transgenes confirmed our expression data and provided new tools to study zebrafish skin. Collectively, these data provide a resource for studying common and distinguishing features of maturing epithelia.

    Propionibacterium acnes bacteriophages display limited genetic diversity and broad killing activity against bacterial skin isolates.

    Investigation of the human microbiome has revealed diverse and complex microbial communities at distinct anatomic sites. The microbiome of the human sebaceous follicle provides a tractable model in which to study its dominant bacterial inhabitant, Propionibacterium acnes, which is thought to contribute to the pathogenesis of the human disease acne. To explore the diversity of the bacteriophages that infect P. acnes, 11 P. acnes phages were isolated from the sebaceous follicles of donors with healthy skin or acne and their genomes were sequenced. Comparative genomic analysis of the P. acnes phage population, which spans a 30-year temporal period and a broad geographic range, reveals striking similarity in terms of genome length, percent GC content, nucleotide identity (>85%), and gene content. This was unexpected, given the far-ranging diversity observed in virtually all other phage populations. Although the P. acnes phages display a broad host range against clinical isolates of P. acnes, two bacterial isolates were resistant to many of these phages. Moreover, the patterns of phage resistance correlate closely with the presence of clustered regularly interspaced short palindromic repeat elements in the bacteria that target a specific subset of phages, conferring a system of prokaryotic innate immunity. The limited diversity of the P. acnes bacteriophages, which may relate to the unique evolutionary constraints imposed by the lipid-rich anaerobic environment in which their bacterial hosts reside, points to the potential utility of phage-based antimicrobial therapy for acne. Importance: Propionibacterium acnes is a dominant member of the skin microflora and has also been implicated in the pathogenesis of acne; however, little is known about the bacteriophages that coexist with and infect this bacterium. Here we present the novel genome sequences of 11 P. acnes phages, thereby substantially increasing the amount of available genomic information about this phage population. Surprisingly, we find that, unlike other well-studied bacteriophages, P. acnes phages are highly homogeneous and show a striking lack of genetic diversity, which is perhaps related to their unique and restricted habitat. They also share a broad ability to kill clinical isolates of P. acnes; phage resistance is not prevalent, but when detected, it appears to be conferred by chromosomally encoded immunity elements within the host genome. We believe that these phages display numerous features that would make them ideal candidates for the development of a phage-based therapy for acne.

    MTHFD1 controls DNA methylation in Arabidopsis.

    DNA methylation is an epigenetic mechanism that has important functions in transcriptional silencing and is associated with repressive histone methylation (H3K9me). To further investigate silencing mechanisms, we screened a mutagenized Arabidopsis thaliana population for expression of SDCpro-GFP, redundantly controlled by DNA methyltransferases DRM2 and CMT3. Here, we identify the hypomorphic mutant mthfd1-1, carrying a mutation (R175Q) in the cytoplasmic bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase (MTHFD1). Decreased levels of oxidized tetrahydrofolates in mthfd1-1 and lethality of loss-of-function demonstrate the essential enzymatic role of MTHFD1 in Arabidopsis. Accumulation of homocysteine and S-adenosylhomocysteine, genome-wide DNA hypomethylation, loss of H3K9me and transposon derepression indicate that S-adenosylmethionine-dependent transmethylation is inhibited in mthfd1-1. Comparative analysis of DNA methylation revealed that the CMT3 and CMT2 pathways involving positive feedback with H3K9me are mostly affected. Our work highlights the sensitivity of epigenetic networks to one-carbon metabolism due to their common S-adenosylmethionine-dependent transmethylation and has implications for human MTHFD1-associated diseases.

    Algal Functional Annotation Tool: a web-based analysis suite to functionally interpret large gene lists using integrated annotation and expression data

    Background: Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, they are usually limited and integrate functional terms from a limited number of databases. Another challenge is the use of annotations to interpret large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, several annotation databases, such as DAVID, integrate data from multiple functional databases and reveal underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Due to renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendiums of algal genomic data. Description: The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG pathway maps and batch gene identifier conversion. Conclusions: The Algal Functional Annotation Tool aims to provide an integrated data-mining environment for algal genomics by combining data from multiple annotation databases into a centralized tool. This site is designed to expedite the process of functional annotation and the interpretation of gene lists, such as those derived from high-throughput RNA-seq experiments. The tool is publicly available at http://pathways.mcdb.ucla.edu.
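
Term enrichment for a gene list is often assessed with a one-sided hypergeometric test. The abstract does not say which statistic the Algal Functional Annotation Tool uses, so the sketch below is only an illustration of that common choice. The function name, parameter names, and numbers are hypothetical.

```python
# Illustrative hypergeometric enrichment test; not the tool's own code.
from math import comb


def enrichment_p_value(genome_size, annotated_in_genome, list_size, annotated_in_list):
    """Probability of seeing at least `annotated_in_list` genes carrying a
    term in a random list of `list_size` genes, drawn without replacement
    from a genome of `genome_size` genes of which `annotated_in_genome`
    carry the term."""
    upper = min(annotated_in_genome, list_size)
    total = comb(genome_size, list_size)
    tail = sum(
        comb(annotated_in_genome, i)
        * comb(genome_size - annotated_in_genome, list_size - i)
        for i in range(annotated_in_list, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 genes in total, 5 annotated with a term,
# and a list of 5 genes that all carry the term.
p = enrichment_p_value(10, 5, 5, 5)
```

In practice such p-values are computed once per functional term and then corrected for multiple testing, since a gene list is typically compared against many terms at once.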