8 research outputs found

    GeneFEAST: the pivotal, gene-centric step in functional enrichment analysis interpretation

    Summary: GeneFEAST, implemented in Python, is a gene-centric functional enrichment analysis (FEA) summarisation and visualisation tool that can be applied to large FEA results arising from upstream FEA pipelines. It produces a systematic, navigable HTML report, making it easy to identify sets of genes putatively driving multiple enrichments and to explore gene-level quantitative data first used to identify input genes. Further, GeneFEAST can compare FEA results from multiple studies, making it possible, for example, to highlight patterns of gene expression amongst genes commonly differentially expressed in two sets of conditions, and giving rise to shared enrichments under those conditions. GeneFEAST offers a novel, effective way to address the complexities of linking up many overlapping FEA results to their underlying genes and data, advancing gene-centric hypotheses, and providing pivotal information for downstream validation experiments. Availability: GeneFEAST is available at https://github.com/avigailtaylor/GeneFEAST. Contact: [email protected]. Main text: 3 pages, 1 figure. Supplementary Information: 16 pages, 3 figures, 2 tables, 4 boxes.

    In-Depth Genomic and Phenotypic Characterization of the Antarctic Psychrotolerant Strain Pseudomonas sp. MPC6 Reveals Unique Metabolic Features, Plasticity, and Biotechnological Potential

    We obtained the complete genome sequence of the psychrotolerant extremophile Pseudomonas sp. MPC6, a natural polyhydroxyalkanoate (PHA)-producing bacterium able to grow rapidly at low temperatures. Genomic and phenotypic analyses allowed us to place this isolate within the Pseudomonas fluorescens phylogroup of pseudomonads and to reveal its metabolic versatility and plasticity. The isolate possesses the gene machinery for metabolizing a variety of toxic aromatic compounds such as toluene, phenol, chloroaromatics, and TNT. In addition, it can use both C6- and C5-carbon sugars like xylose and arabinose as carbon substrates, an uncommon feature for bacteria of this genus. Furthermore, Pseudomonas sp. MPC6 exhibits a high copy number of genes encoding enzymes involved in oxidative and cold-stress response that allow it to cope with high concentrations of heavy metals (As, Cd, Cu) and low temperatures, a finding that was further validated experimentally. We then assessed the growth performance of MPC6 on glycerol over a temperature range from 0 to 45°C, the latter temperature corresponding to the limit at which this Antarctic isolate was no longer able to propagate. On the other hand, the MPC6 genome comprised considerably fewer virulence and drug resistance factors than pathogenic Pseudomonas strains, thus supporting its safety. Unexpectedly, we found five PHA synthases within the genome of MPC6, one of which clustered separately from the other four. This PHA synthase shared only 40% sequence identity at the amino acid level with the only PHA polymerase described for Pseudomonas (63-1 strain) able to produce copolymers of short- and medium-chain length PHAs. Batch cultures for PHA synthesis in Pseudomonas sp. MPC6 using sugars, decanoate, ethylene glycol, and organic acids as carbon substrates resulted in biopolymers with different monomer compositions. This indicates that the PHA synthases play a critical role in defining not only the final chemical structure of the biosynthesized PHA, but also the employed biosynthetic pathways. Based on the results obtained, we conclude that Pseudomonas sp. MPC6 can be exploited as a bioremediator and biopolymer factory, as well as a model strain to unveil molecular mechanisms behind adaptation to cold and extreme environments.

    Whole Exome Sequencing Reveals the Major Genetic Contributors to Nonsyndromic Tetralogy of Fallot

    Rationale: Familial recurrence studies provide strong evidence for a genetic component to the predisposition to sporadic, nonsyndromic Tetralogy of Fallot (TOF), the most common cyanotic congenital heart disease phenotype. Rare genetic variants have been identified as important contributors to the risk of congenital heart disease, but relatively small numbers of TOF cases have been studied to date. Objective: We used whole exome sequencing to assess the prevalence of unique, deleterious variants in the largest cohort of nonsyndromic TOF patients reported to date. Methods and Results: Eight hundred twenty-nine TOF patients underwent whole exome sequencing. The presence of unique, deleterious variants was determined, defined by their absence in the Genome Aggregation Database and a scaled combined annotation-dependent depletion score of ≥20. The clustering of variants in 2 genes, NOTCH1 and FLT4, surpassed thresholds for genome-wide significance (assigned as P<5×10⁻⁸) after correction for multiple comparisons. NOTCH1 was most frequently found to harbor unique, deleterious variants. Thirty-one changes were observed in 37 probands (4.5%; 95% CI, 3.2%–6.1%) and included 7 loss-of-function variants, 22 missense variants, and 2 in-frame indels. Sanger sequencing of the unaffected parents of 7 cases identified 5 de novo variants. Three NOTCH1 variants (p.G200R, p.C607Y, and p.N1875S) were subjected to functional evaluation, and 2 showed a reduction in Jagged1-induced NOTCH signaling. FLT4 variants were found in 2.4% (95% CI, 1.6%–3.8%) of TOF patients, with 21 patients harboring 22 unique, deleterious variants. The variants identified were distinct from those that cause the congenital lymphoedema syndrome Milroy disease. In addition to NOTCH1, FLT4, and the well-established TOF gene TBX1, we identified potential association with variants in several other candidates, including RYR1, ZFPM1, CAMTA2, DLX6, and PCM1. Conclusions: The NOTCH1 locus is the most frequent site of genetic variants predisposing to nonsyndromic TOF, followed by FLT4. Together, variants in these genes are found in almost 7% of TOF patients.

    Evaluation of computational methods for human microbiome analysis using simulated data.

    Indexing: Scopus. Background: Our understanding of the composition, function, and health implications of human microbiota has been advanced by high-throughput sequencing and the development of new genomic analyses. However, trade-offs among alternative strategies for the acquisition and analysis of sequence data remain understudied. Methods: We assessed eight popular taxonomic profiling pipelines (MetaPhlAn2, metaMix, PathoScope 2.0, Sigma, Kraken, ConStrains, Centrifuge and Taxator-tk) against a battery of metagenomic datasets simulated from real data. The metagenomic datasets were modeled on 426 complete or permanent draft genomes stored in the Human Oral Microbiome Database and were designed to simulate various experimental conditions, both in the design of a putative experiment (read length, 75-1,000 bp reads; sequence depth, 100K-10M) and in metagenomic composition (number of species present, 10, 100, or 426; species distribution). The sensitivity and specificity of each of the pipelines under various scenarios were measured. We also estimated the relative root mean square error and average relative error to assess the abundance estimates produced by different methods. Additional datasets were generated for five of the pipelines to simulate the presence within a metagenome of an unreferenced species, closely related to other referenced species. Additional datasets were also generated in order to measure computational time on datasets of ever-increasing sequencing depth (up to 6 × 10⁷). Results: Testing of eight pipelines against 144 simulated metagenomic datasets initially produced 1,104 discrete results. Pipelines using a marker gene strategy, MetaPhlAn2 and ConStrains, were overall less sensitive than other pipelines, with the notable exception of Taxator-tk. This lower sensitivity was largely offset by runtime, which was significantly lower than that of more sensitive pipelines that rely on whole-genome alignments, such as PathoScope 2.0. However, pipelines that used strategies to speed up alignment between genomic references and metagenomic reads, such as kmerization, were able to combine both high sensitivity and low run time, as is the case with Kraken and Centrifuge. In all pipelines, the absence of a species genome from the database mostly led to assignment of reads to the most closely related species available. Our results therefore suggest that taxonomic profilers that use kmerization have largely superseded those that use gene markers, coupling low run times with high sensitivity and specificity. Taxonomic profilers using more time-consuming read reassignment, such as PathoScope 2.0, provided the most sensitive profiles under common metagenomic sequencing scenarios. https://peerj.com/articles/9688/

    Genome sequence of two members of the chloroaromatic-degrading MT community: Pseudomonas reinekei MT1 and Achromobacter xylosoxidans MT3

    We describe the genome sequences of Pseudomonas reinekei MT1 and Achromobacter xylosoxidans MT3, the most abundant members of a bacterial community capable of degrading chloroaromatic compounds. The MT1 genome contains open reading frames (ORFs) encoding enzymes responsible for the catabolism of chlorosalicylate, methylsalicylate, chlorophenols, phenol, benzoate, p-coumarate, phenylalanine, and phenylacetate. On the other hand, the MT3 strain genome possesses no ORFs to metabolize chlorosalicylates; instead, the bacterium is capable of metabolizing nitro-phenolic and phenolic compounds, which MT3 can use as its sole carbon and energy source. We also confirmed that MT3 displays the genetic machinery for the metabolism of chlorocatechols and chloromuconates, where the latter are toxic compounds secreted by MT1 when degrading chlorosalicylates. Altogether, this work will advance our fundamental understanding of bacterial interactions.

    OmniScope: a Computational Pipeline for Metagenomic Species Identification Using Reference and de novo Assembly

    Metagenomics has revolutionized the field of microbiology and promises to impact clinical practice as well. While the number of genomes available for reference-based metagenomic pathogen identification keeps increasing, it is still difficult to classify most of the reads from a metagenomic experiment due to intra-species diversity and uncharacterized pathogens. Here, we propose to combine reference-based metagenomic profiling (faster) with de novo metagenomic assembly (more accurate) to maximize the number of used reads and allow for the discovery of novel species in the data that are not identified by reference-based methods. We take advantage of the fact that homologous sequences among related but different species form detectable peaks in coverage. Reads belonging to those peaks are then extracted and assembled into contigs. Finally, using a de novo strategy that involves storing the de Bruijn graph in Bloom filters, we take the unmapped reads and, together with the contigs, create a hybrid assembly that increases the number of species discovered. We provide a proof of concept and discuss potential applications for both clinical and environmental samples. Test data and code are freely available on GitHub at www.github.com/mjmiossec/omniscope

    Mitotic DNA synthesis is caused by transcription-replication conflicts in BRCA2-deficient cells

    Aberrant replication causes cells lacking BRCA2 to enter mitosis with under-replicated DNA, which activates a repair mechanism known as mitotic DNA synthesis (MiDAS). Here, we identify, genome-wide, the sites where MiDAS reactions occur when BRCA2 is abrogated. High-resolution profiling revealed that these sites are different from MiDAS at aphidicolin-induced common fragile sites in that they map to genomic regions replicating in early S-phase, which are close to early-firing replication origins, are highly transcribed, and display R-loop-forming potential. Both transcription inhibition in early S-phase and RNaseH1 overexpression reduced MiDAS in BRCA2-deficient cells, indicating that transcription-replication conflicts (TRCs) and R-loops are the source of MiDAS. Importantly, the MiDAS sites identified in BRCA2-deficient cells also represent hotspots for genomic rearrangements in BRCA2-mutated breast tumors. Thus, our work provides a mechanism for how tumor-predisposing BRCA2 inactivation links transcription-induced DNA damage with mitotic DNA repair to fuel the genomic instability characteristic of cancer cells.