    Explosive Percolation in the Human Protein Homology Network

    We study the explosive character of the percolation transition in a real-world network. We show that the emergence of a spanning cluster in the Human Protein Homology Network (H-PHN) exhibits similar features to an Achlioptas-type process and is markedly different from regular random percolation. The underlying mechanism of this transition can be described by slow-growing clusters that remain isolated until the later stages of the process, when the addition of a small number of links leads to the rapid interconnection of these modules into a giant cluster. Our results indicate that the evolutionary-based process that shapes the topology of the H-PHN through duplication-divergence events may occur in sudden steps, similarly to what is seen in first-order phase transitions. Comment: 13 pages, 6 figures
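As a rough illustration of the contrast this abstract draws, the sketch below grows a random graph edge by edge and records the largest-cluster fraction, either by ordinary random percolation or by an Achlioptas-type product rule. The product rule, the union-find bookkeeping, and all names (`DisjointSet`, `largest_cluster_curve`) are assumptions made for illustration; the abstract does not say which rule or implementation the authors used.

```python
import random


class DisjointSet:
    """Union-find structure that also tracks the largest cluster size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.largest = 1

    def find(self, x):
        # Path halving keeps the trees shallow.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        self.largest = max(self.largest, self.size[ra])


def random_edge(n, rng):
    """Draw two distinct nodes uniformly at random."""
    a = rng.randrange(n)
    b = rng.randrange(n)
    while b == a:
        b = rng.randrange(n)
    return a, b


def largest_cluster_curve(n, n_edges, product_rule, seed=0):
    """Return the largest-cluster fraction after each added edge.

    With product_rule=True, two candidate edges are drawn per step and the
    one joining the smaller product of cluster sizes is kept (hypothetical
    choice of Achlioptas rule); otherwise a single random edge is added.
    """
    rng = random.Random(seed)
    ds = DisjointSet(n)
    curve = []
    for _ in range(n_edges):
        edge = random_edge(n, rng)
        if product_rule:
            other = random_edge(n, rng)

            def score(e):
                return ds.size[ds.find(e[0])] * ds.size[ds.find(e[1])]

            if score(other) < score(edge):
                edge = other
        ds.union(*edge)
        curve.append(ds.largest / n)
    return curve
```

Comparing `largest_cluster_curve(n, n_edges, False)` with `largest_cluster_curve(n, n_edges, True)` for the same `n` gives the two curves whose shapes near the transition are being contrasted.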

    Electrochemical integration of graphene with light absorbing copper-based thin films

    We present an electrochemical route for the integration of graphene with light-sensitive copper-based alloys used in optoelectronic applications. Graphene grown using chemical vapor deposition (CVD) and transferred to glass is found to be a robust substrate on which photoconductive Cu_{x}S films of 1-2 µm thickness can be deposited. The effect of growth parameters on the morphology and photoconductivity of Cu_{x}S films is presented. Current-voltage characterization and photoconductivity decay experiments are performed with graphene as one contact and silver epoxy as the other.

    Species-level functional profiling of metagenomes and metatranscriptomes.

    Functional profiles of microbial communities are typically generated using comprehensive metagenomic or metatranscriptomic sequence read searches, which are time-consuming, prone to spurious mapping, and often limited to community-level quantification. We developed HUMAnN2, a tiered search strategy that enables fast, accurate, and species-resolved functional profiling of host-associated and environmental communities. HUMAnN2 identifies a community's known species, aligns reads to their pangenomes, performs translated search on unclassified reads, and finally quantifies gene families and pathways. Relative to pure translated search, HUMAnN2 is faster and produces more accurate gene family profiles. We applied HUMAnN2 to study clinal variation in marine metabolism, ecological contribution patterns among human microbiome pathways, variation in species' genomic versus transcriptional contributions, and strain profiling. Further, we introduce 'contributional diversity' to explain patterns of ecological assembly across different microbial community types.

    Efficient oligonucleotide probe selection for pan-genomic tiling arrays

    Background: Array comparative genomic hybridization is a fast and cost-effective method for detecting, genotyping, and comparing the genomic sequence of unknown bacterial isolates. This method, as with all microarray applications, requires adequate coverage of probes targeting the regions of interest. An unbiased tiling of probes across the entire length of the genome is the most flexible design approach. However, such a whole-genome tiling requires that the genome sequence is known in advance. For the accurate analysis of uncharacterized bacteria, an array must query a fully representative set of sequences from the species' pan-genome. Prior microarrays have included only a single strain per array or the conserved sequences of gene families. These arrays omit potentially important genes and sequence variants from the pan-genome. Results: This paper presents a new probe selection algorithm (PanArray) that can tile multiple whole genomes using a minimal number of probes. Unlike arrays built on clustered gene families, PanArray uses an unbiased, probe-centric approach that does not rely on annotations, gene clustering, or multi-alignments. Instead, probes are evenly tiled across all sequences of the pangenome at a consistent level of coverage. To minimize the required number of probes, probes conserved across multiple strains in the pan-genome are selected first, and additional probes are used only where necessary to span polymorphic regions of the genome. The viability of the algorithm is demonstrated by array designs for seven different bacterial pan-genomes and, in particular, the design of a 385,000 probe array that fully tiles the genomes of 20 different Listeria monocytogenes strains with overlapping probes at greater than twofold coverage. Conclusion: PanArray is an oligonucleotide probe selection algorithm for tiling multiple genome sequences using a minimal number of probes. 
It is capable of fully tiling all genomes of a species on a single microarray chip. These unique pan-genome tiling arrays provide maximum flexibility for the analysis of both known and uncharacterized strains. https://doi.org/10.1186/1471-2105-10-29
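The selection order described in this abstract (probes conserved across strains first, extra probes only for polymorphic regions) can be loosely illustrated with a greedy covering sketch. This is not the PanArray algorithm itself: treating every k-mer as a candidate probe, requiring only onefold coverage of each position, and the function name `greedy_tiling_probes` are all simplifying assumptions.

```python
from collections import defaultdict


def greedy_tiling_probes(genomes, k):
    """Greedily pick k-mer probes until every position is covered once.

    genomes: dict mapping a strain name to its sequence (hypothetical input
    format). A k-mer shared by several strains covers positions in all of
    them, so conserved probes naturally win the early rounds.
    """
    # Map each candidate k-mer to the (strain, position) pairs it covers.
    coverage = defaultdict(set)
    for strain, seq in genomes.items():
        for start in range(len(seq) - k + 1):
            kmer = seq[start:start + k]
            for pos in range(start, start + k):
                coverage[kmer].add((strain, pos))

    uncovered = {(strain, pos)
                 for strain, seq in genomes.items()
                 for pos in range(len(seq))}
    candidates = sorted(coverage)  # fixed order makes tie-breaking deterministic
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda m: len(coverage[m] & uncovered))
        gained = coverage[best] & uncovered
        if not gained:
            break  # remaining positions cannot be covered (sequence shorter than k)
        chosen.append(best)
        uncovered -= gained
    return chosen
```

For example, `greedy_tiling_probes({"A": "ACGTACGT", "B": "ACGTTTTT"}, 4)` first selects the shared k-mer `ACGT` and then needs only `TTTT` to cover the region where strain B diverges.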

    Large-scale comparative genomic ranking of taxonomically restricted genes (TRGs) in bacterial and archaeal genomes

    BACKGROUND: Lineage-specific, or taxonomically restricted genes (TRGs), especially those that are species and strain-specific, are of special interest because they are expected to play a role in defining exclusive ecological adaptations to particular niches. Despite this, they are relatively poorly studied and little understood, in large part because many are still orphans or only have homologues in very closely related isolates. This lack of homology confounds attempts to establish the likelihood that a hypothetical gene is expressed and, if so, to determine the putative function of the protein. METHODOLOGY/PRINCIPAL FINDINGS: We have developed "QIPP" ("Quality Index for Predicted Proteins"), an index that scores the "quality" of a protein based on non-homology-based criteria. QIPP can be used to assign a value between zero and one to any protein based on comparing its features to other proteins in a given genome. We have used QIPP to rank the predicted proteins in the proteomes of Bacteria and Archaea. This ranking reveals that there is a large amount of variation in QIPP scores, and identifies many high-scoring orphans as potentially "authentic" (expressed) orphans. There are significant differences in the distributions of QIPP scores between orphan and non-orphan genes for many genomes and a trend for less well-conserved genes to have lower QIPP scores. CONCLUSIONS: The implication of this work is that QIPP scores can be used to further annotate predicted proteins with information that is independent of homology. Such information can be used to prioritize candidates for further analysis. Data generated for this study can be found in the OrphanMine at http://www.genomics.ceh.ac.uk/orphan_mine

    Mir-132/212 is required for maturation of binocular matching of orientation preference and depth perception

    MicroRNAs (miRNAs) are known to mediate post-transcriptional gene regulation, but their role in postnatal brain development is still poorly explored. We show that the expression of many miRNAs is dramatically regulated during functional maturation of the mouse visual cortex, with the miR-132/212 family being one of the top upregulated miRNAs. Age-downregulated transcripts are significantly enriched in miR-132/miR-212 putative targets and in genes upregulated in miR-132/212 null mice. At a functional level, miR-132/212 deletion affects the development of receptive fields of cortical neurons, causing a specific impairment of binocular matching of orientation preference but leaving orientation and direction selectivity unaltered. This deficit is associated with reduced depth perception in the visual cliff test. Deletion of miR-132/212 from forebrain excitatory neurons replicates the binocular matching deficits. Thus, the miR-132/212 family shapes the age-dependent transcriptome of the visual cortex during a specific developmental window, resulting in maturation of binocular cortical cells and depth perception.

    BPGA- an ultra-fast pan-genome analysis pipeline

    Recent advances in ultra-high-throughput sequencing technology and metagenomics have led to a paradigm shift in microbial genomics from few-genome comparisons to large-scale pan-genome studies at different scales of phylogenetic resolution. Pan-genome studies provide a framework for estimating the genomic diversity of the dataset, determining the core (conserved), accessory (dispensable) and unique (strain-specific) gene pools of a species, tracing horizontal gene flux across strains and providing insight into species evolution. Existing pan-genome software tools suffer from various limitations, such as limited datasets, difficult installation or requirements, and inadequate functional features. Here we present an ultra-fast computational pipeline, BPGA (Bacterial Pan Genome Analysis tool), with seven functional modules. In addition to the routine pan-genome analyses, BPGA introduces a number of novel features for downstream analyses, such as core/pan/MLST (Multi Locus Sequence Typing) phylogeny, exclusive presence/absence of genes in specific strains, subset analysis, atypical G + C content analysis and KEGG & COG mapping of core, accessory and unique genes. Other notable features include minimal running prerequisites, freedom to select the gene clustering method, ultra-fast execution, a user-friendly command line interface and high-quality graphics outputs. The performance of BPGA has been evaluated using a dataset of complete genome sequences of 28 Streptococcus pyogenes strains.
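The core, accessory and unique categories named in this abstract can be made concrete with a small presence/absence sketch. It assumes gene clustering has already been done and that each cluster is given as the set of strains containing it; the function name and input layout are hypothetical and do not reflect BPGA's actual interface.

```python
def partition_pan_genome(clusters, n_genomes):
    """Split gene clusters into core, accessory and unique gene pools.

    clusters: dict mapping a cluster name to the strains that carry it
    (hypothetical input format). A cluster is core if every genome has it,
    unique if exactly one genome has it, and accessory otherwise.
    """
    pools = {"core": [], "accessory": [], "unique": []}
    for name, strains in clusters.items():
        count = len(set(strains))
        if count == n_genomes:
            pools["core"].append(name)
        elif count == 1:
            pools["unique"].append(name)
        else:
            pools["accessory"].append(name)
    return pools
```

With three genomes, a cluster found in `{"s1", "s2", "s3"}` lands in the core pool, one found only in `{"s2"}` is unique, and one found in `{"s1", "s3"}` is accessory.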

    Exploring the symbiotic pangenome of the nitrogen-fixing bacterium Sinorhizobium meliloti

    Background: Sinorhizobium meliloti is a model system for the study of symbiotic nitrogen fixation. Extensive polymorphism at the genetic and phenotypic level is present in natural populations of this species, especially in relation to symbiotic promotion of plant growth. AK83 and BL225C are two nodule-isolated strains with diverse symbiotic phenotypes; BL225C is more efficient in promoting growth of Medicago sativa plants than strain AK83. In order to investigate the genetic determinants of the phenotypic diversification of S. meliloti strains AK83 and BL225C, we sequenced the complete genomes of these two strains. Results: With sizes of 7.14 Mbp and 6.97 Mbp, respectively, the genomes of AK83 and BL225C are larger than that of the laboratory strain Rm1021. The core genome of the Rm1021, AK83 and BL225C strains included 5124 orthologous groups, while the accessory genome was composed of 2700 orthologous groups. While Rm1021 and BL225C have only three replicons (chromosome, pSymA and pSymB), AK83 also has two plasmids, 260 and 70 Kbp long. We found 65 interesting orthologous groups of genes that were present only in the accessory genome, consequently responsible for phenotypic diversity and putatively involved in plant-bacterium interaction. Notably, the symbiosis-inefficient AK83 lacked several genes required for microaerophilic growth inside nodules, while several genes for accessory functions related to competition, plant invasion and bacteroid tropism were identified only in the AK83 and BL225C strains. The presence and extent of polymorphism in regulons of transcription factors involved in symbiotic interaction were also analyzed. Our results indicate that regulons are flexible, with a large number of accessory genes, suggesting that regulon polymorphism could also be a key determinant in the variability of symbiotic performance among the analyzed strains. Conclusions: The extended comparative genomics approach revealed a variable subset of genes and regulons that may contribute to the symbiotic diversity.

    EDGAR: A software framework for the comparative analysis of prokaryotic genomes

    Blom J, Albaum S, Doppmeier D, et al. EDGAR: a software framework for the comparative analysis of prokaryotic genomes. BMC Bioinformatics. 2009;10(1):154. Background: The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results: To support these studies, EDGAR ("Efficient Database framework for comparative Genome Analyses using BLAST score Ratios") was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion: EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform-independent user interface (http://edgar.cebitec.uni-bielefeld.de), where the precomputed data sets can be browsed.
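The "BLAST score ratio" in EDGAR's name is commonly defined as the bit score of a query against a hit divided by the bit score of the query against itself. The sketch below shows that normalization only; the function names and the `cutoff` value are placeholders, and the abstract does not describe how EDGAR actually derives or applies its threshold.

```python
def blast_score_ratio(hit_bit_score, self_bit_score):
    """Normalize a BLAST bit score by the query's self-hit bit score.

    A ratio near 1.0 means the hit is almost as good as the query aligned
    to itself; a ratio near 0.0 means little similarity.
    """
    if self_bit_score <= 0:
        raise ValueError("self-hit bit score must be positive")
    return hit_bit_score / self_bit_score


def is_candidate_ortholog(hit_bit_score, self_bit_score, cutoff=0.3):
    """Flag a hit whose score ratio reaches a cutoff (placeholder value)."""
    return blast_score_ratio(hit_bit_score, self_bit_score) >= cutoff
```

Because the ratio is relative to each gene's own maximum score, one cutoff can be compared across genes of different lengths, which raw bit scores do not allow.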

    Maize (Zea mays L.) Genome Diversity as Revealed by RNA-Sequencing

    Maize is rich in genetic and phenotypic diversity. Understanding the sequence, structural, and expression variation that contributes to phenotypic diversity would facilitate more efficient varietal improvement. RNA based sequencing (RNA-seq) is a powerful approach for transcriptional analysis, assessing sequence variation, and identifying novel transcript sequences, particularly in large, complex, repetitive genomes such as maize. In this study, we sequenced RNA from whole seedlings of 21 maize inbred lines representing diverse North American and exotic germplasm. Single nucleotide polymorphism (SNP) detection identified 351,710 polymorphic loci distributed throughout the genome covering 22,830 annotated genes. Tight clustering of two distinct heterotic groups and exotic lines was evident using these SNPs as genetic markers. Transcript abundance analysis revealed minimal variation in the total number of genes expressed across these 21 lines (57.1% to 66.0%). However, the transcribed gene set among the 21 lines varied, with 48.7% expressed in all of the lines, 27.9% expressed in one to 20 lines, and 23.4% expressed in none of the lines. De novo assembly of RNA-seq reads that did not map to the reference B73 genome sequence revealed 1,321 high confidence novel transcripts, of which 564 loci were present in all 21 lines, including B73, and 757 loci were restricted to a subset of the lines. RT-PCR validation demonstrated 87.5% concordance with the computational prediction of these expressed novel transcripts. Intriguingly, 145 of the novel de novo assembled loci were present in lines from only one of the two heterotic groups, consistent with the hypothesis that, in addition to sequence polymorphisms and transcript abundance, transcript presence/absence variation is present and, thereby, may be a mechanism contributing to the genetic basis of heterosis.