Automated design of bacterial genome sequences
Background:
Organisms have evolved ways of regulating transcription to better adapt to varying environments. Could the current functional genomics data and models support the possibility of engineering a genome with completely rearranged gene organization while the cell maintains its behavior under environmental challenges? How would we proceed to design a full nucleotide sequence for such genomes?
Results:
As a first step towards answering such questions, recent work showed that it is possible to design alternative transcriptomic models that show the same behavior under environmental variations as the wild-type model. A second step requires evidence that a nucleotide sequence can be provided for a genome encoding such a transcriptional model. We used computational design techniques to design a rewired global transcriptional regulation of Escherichia coli that still shows a transcriptomic response similar to that of the wild type. Afterwards, we “compiled” the transcriptional networks into nucleotide sequences to obtain the final genome sequence. Our computational evolution procedure ensures that we can maintain the genotype-phenotype mapping during the rewiring of the regulatory network. We found that it is theoretically possible to reorganize the E. coli genome into 86% fewer regulated operons. Such refactored genomes consist of operons whose genes share around 60% of their biological functions and, if evolved under highly variable environmental conditions, have regulatory networks that respond more than 20% faster to multiple external perturbations.
Conclusions:
This work provides the first algorithm for producing a genome sequence encoding a rewired transcriptional regulation with wild-type behavior under alternative environments.
Developments in the tools and methodologies of synthetic biology.
Synthetic biology is principally concerned with the rational design and engineering of biologically based parts, devices, or systems. However, biological systems are generally complex and unpredictable, and are therefore intrinsically difficult to engineer. To address these fundamental challenges, synthetic biology aims to unify a body of knowledge from several foundational scientific fields within the context of a set of engineering principles. This shift in perspective is enabling synthetic biologists to address complexity, such that robust biological systems can be designed, assembled, and tested as part of a biological design cycle. The design cycle takes a forward-design approach in which a biological system is specified, modeled, analyzed, and assembled, and its functionality is tested. At each stage of the design cycle, an expanding repertoire of tools is being developed. In this review, we highlight several of these tools in terms of their applications and benefits to the synthetic biology community.
Systematic identification of gene families for use as markers for phylogenetic and phylogeny-driven ecological studies of bacteria and archaea and their major subgroups
With the astonishing rate that the genomic and metagenomic sequence data sets
are accumulating, there are many reasons to constrain the data analyses. One
approach to such constrained analyses is to focus on select subsets of gene
families that are particularly well suited for the tasks at hand. Such gene
families have generally been referred to as marker genes. We are particularly
interested in identifying and using such marker genes for phylogenetic and
phylogeny-driven ecological studies of microbes and their communities. We
therefore refer to these as PhyEco (for phylogenetic and phylogenetic ecology)
markers. The dual use of these PhyEco markers means that we needed to develop
and apply a set of somewhat novel criteria for identification of the best
candidates for such markers. The criteria we focused on included universality
across the taxa of interest, ability to be used to produce robust phylogenetic
trees that reflect as much as possible the evolution of the species from which
the genes come, and low variation in copy number across taxa. We describe here
an automated protocol for identifying potential PhyEco markers from a set of
complete genome sequences. The protocol combines rapid searching, clustering
and phylogenetic tree building algorithms to generate protein families that
meet the criteria listed above. We report here the identification of PhyEco
markers for different taxonomic levels including 40 for all bacteria and
archaea, 114 for all bacteria, and many more for some of the individual phyla
of bacteria. This new list of PhyEco markers should allow much more detailed
automated phylogenetic and phylogenetic ecology analyses of these groups than
possible previously. Comment: 24 pages, 3 figures
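The first and third criteria named in the abstract above (universality across taxa and low variation in copy number) can be illustrated with a small screening routine. This is only a sketch under stated assumptions: the function name `screen_marker_families`, the input layout, and both thresholds are hypothetical and are not taken from the PhyEco protocol, and the second criterion (ability to produce robust phylogenetic trees) is not modeled here at all.

```python
from statistics import pstdev


def screen_marker_families(copy_counts, genomes,
                           min_universality=0.95, max_copy_sd=0.25):
    """Return gene families that are near-universal and close to single-copy.

    copy_counts: {family: {genome: copies}}; a genome missing from the inner
    dict is treated as carrying 0 copies of that family.
    The thresholds are illustrative assumptions, not published values.
    """
    selected = []
    for family, per_genome in copy_counts.items():
        counts = [per_genome.get(g, 0) for g in genomes]
        # Fraction of genomes in which the family is present at least once.
        universality = sum(c > 0 for c in counts) / len(genomes)
        # Population standard deviation of the copy number across genomes.
        copy_sd = pstdev(counts)
        if universality >= min_universality and copy_sd <= max_copy_sd:
            selected.append((family, universality, copy_sd))
    # Most universal first, then least variable in copy number.
    return sorted(selected, key=lambda t: (-t[1], t[2]))


# Toy data (invented for illustration only).
genomes = ["g1", "g2", "g3", "g4"]
copy_counts = {
    "famA": {"g1": 1, "g2": 1, "g3": 1, "g4": 1},  # universal, single-copy
    "famB": {"g1": 1, "g2": 3, "g3": 1, "g4": 5},  # universal, variable copies
    "famC": {"g1": 1, "g2": 1},                    # present in half the genomes
}
markers = screen_marker_families(copy_counts, genomes)
print([name for name, _, _ in markers])
```

In this toy input only `famA` passes both filters; `famB` fails on copy-number variation and `famC` on universality.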
Recovering complete and draft population genomes from metagenome datasets.
Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost of applying them to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., bins containing sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on genome-wide evolution.
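The idea that contigs from the same population show covarying coverage across multiple datasets can be sketched as follows. This is a minimal illustration, assuming a greedy grouping by Pearson correlation against the first contig of each bin; the function names, the correlation threshold, and the toy coverage values are all hypothetical. Published binning tools typically combine coverage with sequence composition and use more robust clustering than this.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length coverage profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A flat profile has no defined correlation; treat it as uncorrelated.
    return cov / (sx * sy) if sx and sy else 0.0


def bin_by_coverage(coverage, min_r=0.95):
    """Greedily group contigs whose coverage covaries across samples.

    coverage: {contig_id: [mean coverage in sample 1, sample 2, ...]}
    Each contig joins the first bin whose seed (first member) it correlates
    with at >= min_r; otherwise it starts a new bin.
    """
    bins = []
    for contig, profile in coverage.items():
        for members in bins:
            if pearson(profile, coverage[members[0]]) >= min_r:
                members.append(contig)
                break
        else:
            bins.append([contig])
    return bins


# Toy coverage profiles across three samples (invented for illustration).
coverage = {
    "contig_1": [10.0, 40.0, 5.0],
    "contig_2": [12.0, 43.0, 6.0],
    "contig_3": [30.0, 2.0, 25.0],
    "contig_4": [28.0, 3.0, 27.0],
}
bins = bin_by_coverage(coverage)
print(bins)
```

With a single sample every profile has length one and no correlation can be computed, which is one way to see why multiple datasets help the binning step.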
Genetic affinities within a large global collection of pathogenic Leptospira: implications for strain identification and molecular epidemiology
Leptospirosis is an important zoonosis with widespread human health implications. The lack of accurate methods for identifying individual Leptospira strains in outbreak investigations poses considerable problems for disease control. We applied fluorescent amplified fragment length polymorphism (FAFLP) analysis to Leptospira and investigated its utility in establishing genetic relationships among 271 isolates, in the context of species-level assignments of our global collection of isolates and strains obtained from a diverse array of hosts. In addition, this method was compared to an in-house multilocus sequence typing (MLST) method based on polymorphisms in three housekeeping genes, the rrs locus and two envelope proteins. Phylogenetic relationships were deduced from bifurcating Neighbor-joining trees as well as median-joining network analyses integrating both the FAFLP data and MLST-based haplotypes. The phylogenetic relationships were also reproduced through Bayesian analysis of the multilocus sequence polymorphisms. We found FAFLP to be an important method for outbreak investigation and for clustering isolates by geographical descent rather than by genome species type. The FAFLP method did not, however, offer enough taxonomic utility to replace the highly tedious serotyping procedures currently in use. MLST, on the other hand, was found to be highly robust and efficient in identifying ancestral relationships and in segregating outbreak-associated and other strains according to their genome species status and, therefore, could unambiguously be applied to investigating the phylogenetics of Leptospira in the context of taxonomy as well as gene flow.
For instance, MLST was more efficient than FAFLP in clustering strains from the Andaman Islands of India with their counterparts from mainland India and Sri Lanka, implying that such strains share genetic relationships and that leptospiral strains might be frequently circulating between the islands and the mainland.
Development of ListeriaBase and comparative analysis of Listeria monocytogenes
Background: Listeria consists of both pathogenic and non-pathogenic species. Reports of similarities in genomic content between some pathogenic and non-pathogenic species necessitate the investigation of these species at the genomic level to understand the evolution of virulence-associated genes. With Listeria genome data growing exponentially, comparative genomic analysis may give better insights into the evolution, genetics and phylogeny of Listeria spp., leading to better management of the diseases caused by them.
Description: With this motivation, we have developed ListeriaBase, a web-based Listeria genomic resource and analysis platform to facilitate comparative analysis of Listeria spp. ListeriaBase currently houses 850,402 protein-coding genes, 18,113 RNAs and 15,576 tRNAs from 285 genome sequences of different Listeria strains. An AJAX-based real-time search system implemented in ListeriaBase facilitates searching of this large body of genomic data. Our in-house comparative analysis tools, namely the Pairwise Genome Comparison (PGC) tool for comparing two genomes, the Pathogenomics Profiling Tool (PathoProT) for comparing virulence genes, and ListeriaTree for phylogenetic classification, were customized and incorporated in ListeriaBase to facilitate comparative genomic analysis of Listeria spp. Interestingly, we identified a unique genomic feature in the L. monocytogenes genomes in our analysis. The Auto protein sequences of the serotype 4 and the non-serotype 4 strains of L. monocytogenes possessed unique sequence signatures that can differentiate the two groups. We propose that the aut gene may be a potential gene marker for differentiating the serotype 4 strains from other serotypes of L. monocytogenes.
Conclusions: ListeriaBase is a useful resource and analysis platform that can facilitate comparative analysis of Listeria for the scientific community. We have successfully demonstrated some key utilities of ListeriaBase. The knowledge that we obtained from the analyses of L. monocytogenes may be important for future functional studies of this human pathogen. ListeriaBase is currently available at http://listeria.um.edu.my
Clinical metagenomics.
Clinical metagenomic next-generation sequencing (mNGS), the comprehensive analysis of microbial and host genetic material (DNA and RNA) in samples from patients, is rapidly moving from research to clinical laboratories. This emerging approach is changing how physicians diagnose and treat infectious disease, with applications spanning a wide range of areas, including antimicrobial resistance, the microbiome, human host gene expression (transcriptomics) and oncology. Here, we focus on the challenges of implementing mNGS in the clinical laboratory and address potential solutions for maximizing its impact on patient care and public health.
Two intracellular and cell type-specific bacterial symbionts in the placozoan Trichoplax H2
Placozoa is an enigmatic phylum of simple, microscopic, marine metazoans (1,2). Although intracellular bacteria have been found in all members of this phylum, almost nothing is known about their identity, location and interactions with their host (3-6). We used metagenomic and metatranscriptomic sequencing of single host individuals, plus metaproteomic and imaging analyses, to show that the placozoan Trichoplax sp. H2 lives in symbiosis with two intracellular bacteria. One symbiont forms an undescribed genus in the Midichloriaceae (Rickettsiales) (7,8) and has a genomic repertoire similar to that of rickettsial parasites (9,10), but does not seem to express key genes for energy parasitism. Correlative image analyses and three-dimensional electron tomography revealed that this symbiont resides in the rough endoplasmic reticulum of its host's internal fibre cells. The second symbiont belongs to the Margulisbacteria, a phylum without cultured representatives and not known to form intracellular associations (11-13). This symbiont lives in the ventral epithelial cells of Trichoplax, probably metabolizes algal lipids digested by its host and has the capacity to supplement the placozoan's nutrition. Our study shows that one of the simplest animals has evolved highly specific and intimate associations with symbiotic, intracellular bacteria and highlights that symbioses can provide access to otherwise elusive microbial dark matter.
Species Identification and Profiling of Complex Microbial Communities Using Shotgun Illumina Sequencing of 16S rRNA Amplicon Sequences
The high throughput and cost-effectiveness afforded by short-read sequencing
technologies, in principle, enable researchers to perform 16S rRNA profiling of
complex microbial communities at unprecedented depth and resolution. Existing
Illumina sequencing protocols are, however, limited by the fraction of the 16S
rRNA gene that is interrogated and therefore limit the resolution and quality
of the profiling. To address this, we present the design of a novel protocol
for shotgun Illumina sequencing of the bacterial 16S rRNA gene, optimized to
capture more than 90% of sequences in the Greengenes database and with nearly
twice the resolution of existing protocols. Using several in silico and
experimental datasets, we demonstrate that despite the presence of multiple
variable and conserved regions, the resulting shotgun sequences can be used to
accurately quantify the diversity of complex microbial communities. The
reconstruction of a significant fraction of the 16S rRNA gene also enabled high
precision (>90%) in species-level identification thereby opening up potential
application of this approach for clinical microbial characterization. Comment: 17 pages, 2 tables, 2 figures, supplementary material