7 research outputs found

    Combining in silico docking and molecular dynamics simulations to predict the impact of mutations on the substrate specificity of BTL2 lipase

    Lipases are enzymes that hydrolyze the ester bonds between acyl groups and glycerol in triacylglycerides, yielding glycerol and fatty acids. Bacillus thermocatenulatus lipase (BTL2) has shown its highest activity toward tributyrin (C4) as substrate. While broad selectivity for fatty acid chain length plays a key role in wastewater treatment and laundry formulations, short-chain specificity can be used in the food and cosmetic industries. In order to predict its chain length substrate specificity (tributyrin (C4)/tricaprylin (C8)) upon mutation, we developed a scoring function that combines in silico docking and molecular dynamics tools. After calibration on experimentally validated mutants, our scoring function is able to discriminate substrate specificities and predict the impact of a mutation (whether it enhances or reduces specificity) rapidly and accurately (overall correlation r=0.7930, p=0.0007). In addition, the ranking of substrate specificities within the mutants was 100% correct. This method can be adapted to other protein families to predict the effect of a mutation for one specific substrate or for multiple substrates
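
    The abstract above reports two checks on the scoring function: an overall correlation and a ranking comparison. The following is a minimal sketch of how such checks are commonly computed, assuming the correlation is a Pearson r between predicted scores and measured specificities. The function names and the numbers are hypothetical illustrations, not the authors' code or data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def same_ranking(xs, ys):
    """True when both sequences order their items identically."""
    def order(values):
        return sorted(range(len(values)), key=values.__getitem__)
    return order(xs) == order(ys)

# Made-up values for four hypothetical mutants (NOT data from the paper).
predicted = [1.0, 2.0, 3.0, 4.0]   # hypothetical scoring-function output
measured = [1.2, 1.9, 3.5, 3.9]    # hypothetical experimental C4/C8 ratio

r = pearson_r(predicted, measured)
ranks_agree = same_ranking(predicted, measured)
```

    A high r shows that predictions track the measurements overall, while the ranking check only asks whether the mutants are ordered correctly, which is why the two results are reported separately.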

    DolphinNext: a distributed data processing platform for high throughput genomics

    BACKGROUND: The emergence of high throughput technologies that produce vast amounts of genomic data, such as next-generation sequencing (NGS) is transforming biological research. The dramatic increase in the volume of data, the variety and continuous change of data processing tools, algorithms and databases make analysis the main bottleneck for scientific discovery. The processing of high throughput datasets typically involves many different computational programs, each of which performs a specific step in a pipeline. Given the wide range of applications and organizational infrastructures, there is a great need for highly parallel, flexible, portable, and reproducible data processing frameworks. Several platforms currently exist for the design and execution of complex pipelines. Unfortunately, current platforms lack the necessary combination of parallelism, portability, flexibility and/or reproducibility that are required by the current research environment. To address these shortcomings, workflow frameworks that provide a platform to develop and share portable pipelines have recently arisen. We complement these new platforms by providing a graphical user interface to create, maintain, and execute complex pipelines. Such a platform will simplify robust and reproducible workflow creation for non-technical users as well as provide a robust platform to maintain pipelines for large organizations. RESULTS: To simplify development, maintenance, and execution of complex pipelines we created DolphinNext. DolphinNext facilitates building and deployment of complex pipelines using a modular approach implemented in a graphical interface that relies on the powerful Nextflow workflow framework by providing: (1) a drag and drop user interface that visualizes pipelines and allows users to create pipelines without familiarity in underlying programming languages; (2) modules to execute and monitor pipelines in distributed computing environments such as high-performance clusters and/or cloud; (3) reproducible pipelines with version tracking and stand-alone versions that can be run independently; (4) modular process design with process revisioning support to increase reusability and pipeline development efficiency; (5) pipeline sharing with GitHub and automated testing; and (6) extensive reports with R-markdown and shiny support for interactive data visualization and analysis. CONCLUSION: DolphinNext is a flexible, intuitive, web-based data processing and analysis platform that enables creating, deploying, sharing, and executing complex Nextflow pipelines with extensive revisioning and interactive reporting to enhance reproducible results

    An atlas of cell types in the mouse epididymis and vas deferens

    Following testicular spermatogenesis, mammalian sperm continue to mature in a long epithelial tube known as the epididymis, which plays key roles in remodeling sperm protein, lipid, and RNA composition. To understand the roles for the epididymis in reproductive biology, we generated a single-cell atlas of the murine epididymis and vas deferens. We recovered key epithelial cell types including principal cells, clear cells, and basal cells, along with associated support cells that include fibroblasts, smooth muscle, macrophages and other immune cells. Moreover, our data illuminate extensive regional specialization of principal cell populations across the length of the epididymis. In addition to region-specific specialization of principal cells, we find evidence for functionally specialized subpopulations of stromal cells, and, most notably, two distinct populations of clear cells. Our dataset extends on existing knowledge of epididymal biology, and provides a wealth of information on potential regulatory and signaling factors that bear future investigation

    An improved zebrafish transcriptome annotation for sensitive and comprehensive detection of cell type-specific genes

    The zebrafish is ideal for studying embryogenesis and is increasingly applied to model human disease. In these contexts, RNA-sequencing (RNA-seq) provides mechanistic insights by identifying transcriptome changes between experimental conditions. Application of RNA-seq relies on accurate transcript annotation for a genome of interest. Here, we find discrepancies in analysis from RNA-seq datasets quantified using Ensembl and RefSeq zebrafish annotations. These issues were due, in part, to variably annotated 3' untranslated regions and thousands of gene models missing from each annotation. Since these discrepancies could compromise downstream analyses and biological reproducibility, we built a more comprehensive zebrafish transcriptome annotation that addresses these deficiencies. Our annotation improves detection of cell type-specific genes in both bulk and single cell RNA-seq datasets, where it also improves resolution of cell clustering. Thus, we demonstrate that our new transcriptome annotation can outperform existing annotations, providing an important resource for zebrafish researchers

    DEBrowser: interactive differential expression analysis and visualization tool for count data

    BACKGROUND: Sequencing data has become a standard measure of diverse cellular activities. For example, gene expression is accurately measured by RNA sequencing (RNA-Seq) libraries, protein-DNA interactions are captured by chromatin immunoprecipitation sequencing (ChIP-Seq), protein-RNA interactions by crosslinking immunoprecipitation sequencing (CLIP-Seq) or RNA immunoprecipitation sequencing (RIP-Seq), DNA accessibility by assay for transposase-accessible chromatin (ATAC-Seq), DNase or MNase sequencing libraries. The processing of these sequencing techniques involves library-specific approaches. However, in all cases, once the sequencing libraries are processed, the result is a count table specifying the estimated number of reads originating from each genomic locus. Differential analysis to determine which loci have different cellular activity under different conditions starts with the count table and iterates through a cycle of data assessment, preparation and analysis. Such complex analysis often relies on multiple programs and is therefore a challenge for those without programming skills. RESULTS: We developed DEBrowser as an R Bioconductor project to interactively visualize every step of the differential analysis, without programming. The application provides a rich and interactive web-based graphical user interface built on R's Shiny infrastructure. DEBrowser allows users to visualize data with various types of graphs that can be explored further by selecting and re-plotting any desired subset of data. Using the visualization approaches provided, users can determine and correct technical variations such as batch effects and sequencing depth that affect differential analysis. We show DEBrowser's ease of use by reproducing the analysis of two previously published data sets. CONCLUSIONS: DEBrowser is a flexible, intuitive, web-based analysis platform that enables an iterative and interactive analysis of count data without any requirement of programming knowledge
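
    To make the notions of a count table and a sequencing-depth correction concrete, here is a minimal sketch, assuming a simple counts-per-million (CPM) scaling. It is not DEBrowser's code (DEBrowser is an R package); the table values, the sample names, and the function `counts_per_million` are made up for illustration.

```python
# Toy count table: rows are genomic loci, columns are samples.
# All values are hypothetical and chosen only to keep the arithmetic easy.
counts = {
    "geneA": {"ctrl": 10, "treat": 40},
    "geneB": {"ctrl": 90, "treat": 160},
}

def counts_per_million(table):
    """Scale each sample so that its counts sum to one million reads."""
    samples = sorted({sample for row in table.values() for sample in row})
    # Sequencing depth = total number of reads counted in each sample.
    depth = {s: sum(row[s] for row in table.values()) for s in samples}
    return {
        locus: {s: row[s] * 1_000_000 / depth[s] for s in samples}
        for locus, row in table.items()
    }

cpm = counts_per_million(counts)
```

    In this toy table the raw count of geneA is four times higher in "treat" than in "ctrl", but "treat" was also sequenced twice as deeply, so after scaling the difference shrinks to twofold. Dedicated differential expression tools use more robust normalizations than this simple scaling.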

    An atlas of cell types in the mammalian epididymis and vas deferens [preprint]

    Following spermatogenesis in the testis, mammalian sperm continue to mature over the course of approximately 10 days as they transit a long epithelial tube known as the epididymis. The epididymis is comprised of multiple segments/compartments that, in addition to concentrating sperm and preventing their premature activation, play key roles in remodeling the protein, lipid, and RNA composition of maturing sperm. In order to understand the complex roles for the epididymis in reproductive biology, we generated a single cell atlas of gene expression from the murine epididymis and vas deferens. We recovered all the key cell types of the epididymal epithelium, including principal cells, clear cells, and basal cells, along with associated support cells that include fibroblasts, smooth muscle, macrophages and other immune cells. Moreover, our data illuminate extensive regional specialization of principal cell populations across the length of the epididymis, with a substantial fraction of segment-specific genes localized in genomic clusters of functionally-related genes. In addition to the extensive region-specific specialization of principal cells, we find evidence for functionally-specialized subpopulations of stromal cells, and, most notably, two distinct populations of clear cells. Analysis of ligand/receptor expression reveals a network of potential cellular signaling connections, with several predicted interactions between cell types that may play roles in immune cell recruitment and other aspects of epididymal function. Our dataset extends on existing knowledge of epididymal biology, and provides a wealth of information on potential regulatory and signaling factors that bear future investigation

    Genomic Characterization of Endothelial Enhancers Reveals a Multifunctional Role for NR2F2 in Regulation of Arteriovenous Gene Expression

    RATIONALE: Significant progress has revealed transcriptional inputs that underlie regulation of artery and vein endothelial cell fates. However, little is known concerning genome-wide regulation of this process. Therefore, such studies are warranted to address this gap. OBJECTIVE: To identify and characterize artery- and vein-specific endothelial enhancers in the human genome, thereby gaining insights into mechanisms by which blood vessel identity is regulated. METHODS AND RESULTS: Using chromatin immunoprecipitation and deep sequencing for markers of active chromatin in human arterial and venous endothelial cells, we identified several thousand artery- and vein-specific regulatory elements. Computational analysis revealed that NR2F2 (nuclear receptor subfamily 2, group F, member 2) sites were overrepresented in vein-specific enhancers, suggesting a direct role in promoting vein identity. Subsequent integration of chromatin immunoprecipitation and deep sequencing data sets with RNA sequencing revealed that NR2F2 regulated 3 distinct aspects related to arteriovenous identity. First, consistent with previous genetic observations, NR2F2 directly activated enhancer elements flanking cell cycle genes to drive their expression. Second, NR2F2 was essential to directly activate vein-specific enhancers and their associated genes. Our genomic approach further revealed that NR2F2 acts with ERG (ETS-related gene) at many of these sites to drive vein-specific gene expression. Finally, NR2F2 directly repressed only a small number of artery enhancers in venous cells to prevent their activation, including a distal element upstream of the artery-specific transcription factor, HEY2 (hes related family bHLH transcription factor with YRPW motif 2). In arterial endothelial cells, this enhancer was normally bound by ERG, which was also required for arterial HEY2 expression. By contrast, in venous endothelial cells, NR2F2 was bound to this site, together with ERG, and prevented its activation. CONCLUSIONS: By leveraging a genome-wide approach, we revealed mechanistic insights into how NR2F2 functions in multiple roles to maintain venous identity. Importantly, characterization of its role at a crucial artery enhancer upstream of HEY2 established a novel mechanism by which artery-specific expression can be achieved