132 research outputs found

    TRY plant trait database - enhanced coverage and open access

    Plant traits (the morphological, anatomical, physiological, biochemical and phenological characteristics of plants) determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait-based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits, with almost complete coverage for 'plant growth form'. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait-environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives.

    Structural basis for inhibition of homologous recombination by the RecX protein

    The RecA/RAD51 nucleoprotein filament is central to the reaction of homologous recombination (HR). Filament activity must be tightly regulated in vivo as unrestrained HR can cause genomic instability. Our mechanistic understanding of HR is restricted by lack of structural information about the regulatory proteins that control filament activity. Here, we describe a structural and functional analysis of the HR inhibitor protein RecX and its mode of interaction with the RecA filament. RecX is a modular protein assembled of repeated three-helix motifs. The relative arrangement of the repeats generates an elongated and curved shape that is well suited for binding within the helical groove of the RecA filament. Structure-based mutagenesis confirms that conserved basic residues on the concave side of RecX are important for repression of RecA activity. Analysis of RecA filament dynamics in the presence of RecX shows that RecX actively promotes filament disassembly. Collectively, our data support a model in which RecX binding to the helical groove of the filament causes local dissociation of RecA protomers, leading to filament destabilisation and HR inhibition

    Nephele: genotyping via complete composition vectors and MapReduce

    Background: Current sequencing technology makes it practical to sequence many samples of a given organism, raising new challenges for the processing and interpretation of large genomics data sets with associated metadata. Traditional computational phylogenetic methods are ideal for studying the evolution of gene/protein families and using those to infer the evolution of an organism, but are less than ideal for the study of the whole organism, mainly due to the presence of insertions/deletions/rearrangements. These methods provide the researcher with the ability to group a set of samples into distinct genotypic groups based on sequence similarity, which can then be associated with metadata, such as host information, pathogenicity, and time or location of occurrence. Genotyping is critical to understanding, at a genomic level, the origin and spread of infectious diseases. Increasingly, genotyping is coming into use for disease surveillance activities, as well as for microbial forensics. The classic genotyping approach has been based on phylogenetic analysis, starting with a multiple sequence alignment. Genotypes are then established by expert examination of phylogenetic trees. However, these traditional single-processor methods are suboptimal for the rapidly growing sequence datasets being generated by next-generation DNA sequencing machines, because they increase in computational complexity quickly with the number of sequences. Results: Nephele is a suite of tools that uses the complete composition vector algorithm to represent each sequence in the dataset as a vector derived from its constituent k-mers, bypassing the need for multiple sequence alignment, and affinity propagation clustering to group the sequences into genotypes based on a distance measure over the vectors. Our methods produce results that correlate well with expert-defined clades or genotypes, at a fraction of the computational cost of traditional phylogenetic methods run on traditional hardware. Nephele can use the open-source Hadoop implementation of MapReduce to parallelize execution using multiple compute nodes. We were able to generate a neighbour-joined tree of over 10,000 16S samples in less than 2 hours. Conclusions: We conclude that using Nephele can substantially decrease the processing time required for generating genotype trees of tens to hundreds of organisms at genome-scale sequence coverage.
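
The alignment-free idea in the Nephele abstract (represent each sequence as a vector built from its k-mers, then compare vectors with a distance measure) can be illustrated with a small Python sketch. This is not Nephele's code and not the complete composition vector algorithm: it assumes plain relative k-mer frequencies and a cosine distance as simplified stand-ins, and the function names `kmer_vector`, `cosine_distance` and `distance_matrix` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_vector(seq, k=3, alphabet="ACGT"):
    """Relative frequency of every possible k-mer in seq, in a fixed order.

    Simplified stand-in: plain frequencies, without the background
    correction used by the complete composition vector method.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # k-mers containing characters outside the alphabet (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def cosine_distance(u, v):
    """1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def distance_matrix(seqs, k=3):
    """Pairwise distances; no multiple sequence alignment is needed."""
    vectors = [kmer_vector(s, k) for s in seqs]
    return [[cosine_distance(u, v) for v in vectors] for u in vectors]
```

A matrix like this could then be passed to a clustering step (the abstract names affinity propagation) or a tree-building step (the abstract names neighbour joining); neither is shown here.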

    Oscillations by Minimal Bacterial Suicide Circuits Reveal Hidden Facets of Host-Circuit Physiology

    Synthetic biology seeks to enable programmed control of cellular behavior through engineered biological systems. These systems typically consist of synthetic circuits that function inside, and interact with, complex host cells possessing pre-existing metabolic and regulatory networks. Nevertheless, while designing systems, a simple well-defined interface between the synthetic gene circuit and the host is frequently assumed. We describe the generation of robust but unexpected oscillations in the densities of bacterium Escherichia coli populations by simple synthetic suicide circuits containing quorum components and a lysis gene. Contrary to design expectations, oscillations required neither the quorum sensing genes (luxR and luxI) nor known regulatory elements in the PluxI promoter. Instead, oscillations were likely due to density-dependent plasmid amplification that established a population-level negative feedback. A mathematical model based on this mechanism captures the key characteristics of oscillations, and model predictions regarding perturbations to plasmid amplification were experimentally validated. Our results underscore the importance of plasmid copy number and the potential impact of “hidden interactions” on the behavior of engineered gene circuits, a major challenge for standardizing biological parts. As synthetic biology grows as a discipline, increasing value may be derived from tools that enable the assessment of parts in their final context.

    More Than 1,001 Problems with Protein Domain Databases: Transmembrane Regions, Signal Peptides and the Issue of Sequence Homology

    Large-scale genome sequencing gained general importance for life science because functional annotation of otherwise experimentally uncharacterized sequences is made possible by the theory of biomolecular sequence homology. Historically, the paradigm of similarity of protein sequences implying common structure, function and ancestry was generalized based on studies of globular domains. Having the same fold imposes strict conditions over the packing in the hydrophobic core requiring similarity of hydrophobic patterns. The implications of sequence similarity among non-globular protein segments have not been studied to the same extent; nevertheless, homology considerations are silently extended for them. This appears especially detrimental in the case of transmembrane helices (TMs) and signal peptides (SPs) where sequence similarity is necessarily a consequence of physical requirements rather than common ancestry. Thus, matching of SPs/TMs creates the illusion of matching hydrophobic cores. Therefore, inclusion of SPs/TMs into domain models can give rise to wrong annotations. More than 1001 domains among the 10,340 models of Pfam release 23 and 18 domains of SMART version 6 (out of 809) contain SP/TM regions. As expected, fragment-mode HMM searches generate promiscuous hits limited to solely the SP/TM part among clearly unrelated proteins. More worryingly, we show explicit examples that the scores of clearly false-positive hits, even in global-mode searches, can be elevated into the significance range just by matching the hydrophobic runs. In the PIR iProClass database v3.74 using conservative criteria, we find that at least between 2.1% and 13.6% of its annotated Pfam hits appear unjustified for a set of validated domain models. Thus, false-positive domain hits enforced by SP/TM regions can lead to dramatic annotation errors where the hit has nothing in common with the problematic domain model except the SP/TM region itself. 
We suggest a workflow of flagging problematic hits arising from SP/TM-containing models for critical reconsideration by annotation users

    Advancing brain barriers RNA sequencing: guidelines from experimental design to publication

    Background: RNA sequencing (RNA-Seq) in its varied forms has become an indispensable tool for analyzing differential gene expression and thus for the characterization of specific tissues. To understand the genetic signature of the brain barriers, RNA-Seq has also been introduced in brain barriers research. This has led to the availability of both bulk and single-cell RNA-Seq datasets over the last few years. If appropriately performed, RNA-Seq studies provide powerful datasets that allow for significant deepening of knowledge on the molecular mechanisms that establish the brain barriers. However, RNA-Seq studies comprise complex workflows that require many options and variables to be considered before, during and after the sequencing process proper. Main body: In the current manuscript, we build on the interdisciplinary experience of the European PhD Training Network BtRAIN (https://www.btrain-2020.eu/), where bioinformaticians and brain barriers researchers collaborated to analyze and establish RNA-Seq datasets on vertebrate brain barriers. The obstacles BtRAIN has identified in this process have been integrated into the present manuscript. It provides guidelines along the entire workflow of brain barriers RNA-Seq studies, starting from the overall experimental design to the interpretation of results. Focusing on the vertebrate endothelial blood–brain barrier (BBB) and the epithelial blood-cerebrospinal-fluid barrier (BCSFB) of the choroid plexus, we provide a step-by-step description of the workflow, highlighting the decisions to be made at each step and explaining the strengths and weaknesses of individual choices. Finally, we propose recommendations for accurate data interpretation and on the information to be included in a publication to ensure appropriate accessibility of the data and reproducibility of the observations by the scientific community. Conclusion: Next generation transcriptomic profiling of the brain barriers provides a novel resource for understanding the development, function and pathology of these barrier cells, which is essential for understanding CNS homeostasis and disease. Continuous advancement and sophistication of RNA-Seq will require interdisciplinary approaches between brain barrier researchers and bioinformaticians, as successfully performed in BtRAIN. The present guidelines are built on the BtRAIN interdisciplinary experience and aim to facilitate collaboration of brain barriers researchers with bioinformaticians to advance RNA-Seq study design in the brain barriers community.

    SATB1 Mediates Long-Range Chromatin Interactions: A Dual Regulator of Anti-Apoptotic BCL2 and Pro-Apoptotic NOXA Genes

    We thank Ms. Kathy Kyler for her kind help in English editing of the manuscript. Aberrant expression of special AT-rich binding protein 1 (SATB1), a global genomic organizer, has been associated with various cancers, which raises the question of how higher-order chromatin structure contributes to carcinogenesis. Disruption of apoptosis is one of the hallmarks of cancer. We previously demonstrated that SATB1 mediated specific long-range chromosomal interactions between the mbr enhancer located within the 3’-UTR of the BCL2 gene and the promoter to regulate BCL2 expression during early apoptosis. In the present study, we used chromosome conformation capture (3C) assays and molecular analyses to further investigate the function of the SATB1-mediated higher-order chromatin structure in co-regulation of the anti-apoptotic BCL2 gene and the pro-apoptotic NOXA gene located 3.4 Mb downstream on Chromosome 18. We demonstrated that the mbr enhancer spatially juxtaposed the promoters of the BCL2 and NOXA genes through a SATB1-mediated chromatin loop in Jurkat cells. Decreased SATB1 levels switched the mbr-BCL2 loop to an mbr-NOXA loop, and thus changed expression of these two genes. The SATB1-mediated dynamic switch of the chromatin loop structures was essential for the cooperative expression of the BCL2 and NOXA genes in apoptosis. Notably, the role of SATB1 was specific, since inhibition of SATB1 degradation by a caspase-6 inhibitor or a caspase-6-resistant SATB1 mutant reversed expression of BCL2 and NOXA in response to apoptotic stimulation. This study reveals the critical role of SATB1-organized higher-order chromatin structure in regulating the dynamic equilibrium of apoptosis-controlling genes with antagonistic functions and suggests that aberrant SATB1 expression might contribute to cancer development by disrupting the co-regulated genes in apoptosis pathways.