264 research outputs found

    Low-pass sequencing for microbial comparative genomics

    BACKGROUND: We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. RESULTS: As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. CONCLUSION: Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genomes of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich genome of H. sp. NRC-1. Identification of multiple TBP and TFB homologs in these four halophiles is consistent with the hypothesis that different types of complex transcriptional regulation may occur through multiple TBP-TFB combinations in response to rapidly changing environmental conditions. Low-pass shotgun sequence analyses of genomes permit extensive and diverse analyses, and should be generally useful for comparative microbial genomics.

    Pattern of the Divergence of Olfactory Receptor Genes during Tetrapod Evolution

    The olfactory receptor (OR) multigene family is responsible for the sense of smell in vertebrate species. OR genes are scattered widely in our chromosomes and constitute one of the largest gene families in eutherian genomes. Some previous studies revealed that eutherian OR genes diverged mainly during early mammalian evolution. However, the exact period when, and the ecological reason why, eutherian ORs strongly diverged have remained unclear. In this study, I performed a strict data mining effort for marsupial opossum OR sequences and bootstrap analyses to estimate the periods of chromosomal migrations and gene duplications of OR genes during tetrapod evolution. The results indicate that chromosomal migrations occurred mainly during early vertebrate evolution before the monotreme-placental split, and that gene duplications occurred mainly during early mammalian evolution between the bird-mammal split and marsupial-placental split, coinciding with the reduction of opsin genes in primitive mammals. It could be thought that the previous chromosomal dispersal allowed the OR genes to subsequently expand easily, and the nocturnal adaptation of early mammals might have triggered the OR gene expansion.

    Genomic architecture of inflammatory bowel disease in five families with multiple affected individuals.

    Currently, the best clinical predictor for inflammatory bowel disease (IBD) is family history. Over 163 sequence variants have been associated with IBD in genome-wide association studies, but they have weak effects and explain only a fraction of the observed heritability. It is expected that additional variants contribute to the genomic architecture of IBD, possibly including rare variants with effect sizes larger than the identified common variants. Here we applied a family study design and sequenced 38 individuals from five families, under the hypothesis that families with multiple IBD-affected individuals harbor one or more risk variants that (i) are shared among affected family members, (ii) are rare and (iii) have substantial effect on disease development. Our analysis revealed not only novel candidate risk variants but also high polygenic risk scores for common known risk variants in four out of the five families. Functional analysis of our top novel variant in the remaining family, a rare missense mutation in the ubiquitin ligase TRIM11, suggests that it leads to increased nuclear factor of kappa light chain enhancer in B-cells (NF-κB) signaling. We conclude that an accumulation of common weak-effect variants accounts for the high incidence of IBD in most, but not all, families we analyzed and that a family study design can identify novel rare variants conferring risk for IBD with potentially large effect size, such as the TRIM11 p.H414Y mutation.

    A wellness study of 108 individuals using personal, dense, dynamic data clouds.

    Personal data for 108 individuals were collected during a 9-month period, including whole genome sequences; clinical tests, metabolomes, proteomes, and microbiomes at three time points; and daily activity tracking. Using all of these data, we generated a correlation network that revealed communities of related analytes associated with physiology and disease. Connectivity within analyte communities enabled the identification of known and candidate biomarkers (e.g., gamma-glutamyltyrosine was densely interconnected with clinical analytes for cardiometabolic disease). We calculated polygenic scores from genome-wide association studies (GWAS) for 127 traits and diseases, and used these to discover molecular correlates of polygenic risk (e.g., genetic risk for inflammatory bowel disease was negatively correlated with plasma cystine). Finally, behavioral coaching informed by personal data helped participants to improve clinical biomarkers. Our results show that measurement of personal data clouds over time can improve our understanding of health and disease, including early transitions to disease states.

    Extensive Gains and Losses of Olfactory Receptor Genes in Mammalian Evolution

    Odor perception in mammals is mediated by a large multigene family of olfactory receptor (OR) genes. The number of OR genes varies extensively among different species of mammals, and most species have a substantial number of pseudogenes. To gain some insight into the evolutionary dynamics of mammalian OR genes, we identified the entire set of OR genes in platypuses, opossums, cows, dogs, rats, and macaques and studied the evolutionary change of the genes together with those of humans and mice. We found that platypuses and primates have <400 functional OR genes while the other species have 800–1,200 functional OR genes. We then estimated the numbers of gains and losses of OR genes for each branch of the phylogenetic tree of mammals. This analysis showed that (i) gene expansion occurred in the placental lineage each time after it diverged from monotremes and from marsupials and (ii) hundreds of gains and losses of OR genes have occurred in an order-specific manner, making the gene repertoires highly variable among different orders. It appears that the number of OR genes is determined primarily by the functional requirement for each species, but once the number reaches the required level, it fluctuates by random duplication and deletion of genes. This fluctuation seems to have been aided by the stochastic nature of OR gene expression.

    A Framework for Exploring Functional Variability in Olfactory Receptor Genes

    BACKGROUND: Olfactory receptors (ORs) are the largest gene family in mammalian genomes. Since nearly all OR genes are orphan receptors, inference of functional similarity or differences between odorant receptors typically relies on sequence comparisons. Based on the alignment of the entire coding region sequence, OR genes are classified into families and subfamilies, a classification that is believed to be a proxy for OR gene functional variability. However, the assumption that overall protein sequence diversity is a good proxy for functional properties is untested. METHODOLOGY: Here, we propose an alternative sequence-based approach to infer the similarities and differences in OR binding capacity. Our approach is based on similarities and differences in the predicted binding pockets of OR genes, rather than on the entire OR coding region. CONCLUSIONS: Interestingly, our approach yields markedly different results compared to the analysis based on the entire OR coding regions. While neither approach can be tested at this time, the discrepancy between the two calls into question the assumption that the current classification reliably reflects OR gene functional variability.

    Rigorous and thorough bioinformatic analyses of olfactory receptor promoters confirm enrichment of O/E and homeodomain binding sites but reveal no new common motifs

    BACKGROUND: Mammalian olfactory receptors (ORs) are subject to a remarkable but poorly understood regime of transcriptional regulation, whereby individual olfactory neurons each express only one allele of a single member of the large OR gene family. RESULTS: We performed a rigorous search for enriched sequence motifs in the largest dataset of OR promoter regions analyzed to date. We combined measures of cross-species conservation with databases of known transcription factor binding sites and ab initio motif-finding algorithms. We found strong enrichment of binding sites for the O/E family of transcription factors and for homeodomain factors, both already known to be involved in the transcriptional control of ORs, but did not identify any novel enriched sequences. We also found that TATA-boxes are present in at least a subset of OR promoters. CONCLUSIONS: Our rigorous approach provides a template for the analysis of the regulation of large gene families and demonstrates some of the difficulties and pitfalls of such analyses. Although currently available bioinformatics methods cannot detect all transcriptional regulatory elements, our thorough analysis of OR promoters shows that in the case of this gene family, experimental approaches have probably already identified all the binding factors common to large fractions of OR promoters.