Genome Sequence-Based Discriminator for Vancomycin-Intermediate Staphylococcus aureus
Vancomycin is the mainstay of treatment for patients with Staphylococcus aureus infections, and reduced susceptibility to vancomycin is becoming increasingly common. Accordingly, the development of rapid and accurate assays for the diagnosis of vancomycin-intermediate S. aureus (VISA) will be critical. We developed and applied a genome-based machine-learning approach for discrimination between VISA and vancomycin-susceptible S. aureus (VSSA) using 25 whole-genome sequences. The resulting machine-learning model, based on 14 gene parameters, including 3 molecular typing markers and 11 genes implicated in reduced vancomycin susceptibility, is able to unambiguously distinguish between the VISA and VSSA isolates analyzed here despite the fact that they do not form evolutionarily distinct groups. As such, the model is able to discriminate based on specific genomic markers of antibiotic susceptibility rather than overall sequence relatedness. Subsequent evaluation of the model using leave-one-out validation yielded a classification accuracy of 84%. The machine-learning approach described here provides a generalized framework for the application of genome sequence analysis to the classification of bacteria that differ with respect to clinically relevant phenotypes and should be particularly useful in defining the genomic features that underlie antibiotic resistance
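The leave-one-out validation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not name the classifier, so the nearest-centroid model below is only a stand-in, and the feature values in `toy_X` and `toy_y` are hypothetical, not the 14 gene parameters from the study. The sketch shows only the validation loop: each isolate is held out once, a model is fit on the rest, and the held-out isolate is then classified.

```python
from typing import Dict, List, Sequence


def nearest_centroid_fit(X: List[Sequence[float]], y: List[str]) -> Dict[str, List[float]]:
    """Return the mean feature vector per class (a stand-in model, not the paper's)."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids: Dict[str, List[float]], row: Sequence[float]) -> str:
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sqdist(center: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def leave_one_out_accuracy(X: List[Sequence[float]], y: List[str]) -> float:
    """Hold out each sample once, train on the others, and score the held-out sample."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        model = nearest_centroid_fit(train_X, train_y)
        if nearest_centroid_predict(model, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical two-feature toy data; real inputs would be per-genome gene parameters.
toy_X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
toy_y = ["VSSA", "VSSA", "VSSA", "VISA", "VISA", "VISA"]

if __name__ == "__main__":
    print(leave_one_out_accuracy(toy_X, toy_y))
```

Because every sample is scored by a model that never saw it, leave-one-out accuracy can fall below the accuracy measured on the full training set. That would be consistent with the abstract reporting both unambiguous separation of the analyzed isolates and 84% accuracy under validation.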
Assortative Mating on Ancestry-Variant Traits in Admixed Latin American Populations
Assortative mating is a universal feature of human societies, and individuals from ethnically diverse populations are known to mate assortatively based on similarities in genetic ancestry. However, little is currently known regarding the exact phenotypic cues, or their underlying genetic architecture, which inform ancestry-based assortative mating. We developed a novel approach, using genome-wide analysis of ancestry-specific haplotypes, to evaluate ancestry-based assortative mating on traits whose expression varies among the three continental population groups – African, European, and Native American – that admixed to form modern Latin American populations. Application of this method to genome sequences sampled from Colombia, Mexico, Peru, and Puerto Rico revealed widespread ancestry-based assortative mating. We discovered a number of anthropometric traits (body mass, height, and facial development) and neurological attributes (educational attainment and schizophrenia) that serve as phenotypic cues for ancestry-based assortative mating. Major histocompatibility complex (MHC) loci show population-specific patterns of both assortative and disassortative mating in Latin America. Ancestry-based assortative mating in the populations analyzed here appears to be driven primarily by African ancestry. This study serves as an example of how population genomic analyses can yield novel insights into human behavior
Lateral Gene Transfer in a Heavy Metal-Contaminated-Groundwater Microbial Community
Unraveling the drivers controlling the response and adaptation of biological communities to environmental change, especially anthropogenic activities, is a central but poorly understood issue in ecology and evolution. Comparative genomics studies suggest that lateral gene transfer (LGT) is a major force driving microbial genome evolution, but its role in the evolution of microbial communities remains elusive. To delineate the importance of LGT in mediating the response of a groundwater microbial community to heavy metal contamination, representative Rhodanobacter reference genomes were sequenced and compared to shotgun metagenome sequences. 16S rRNA gene-based amplicon sequence analysis indicated that Rhodanobacter populations were highly abundant in contaminated wells with low pHs and high levels of nitrate and heavy metals but remained rare in the uncontaminated wells. Sequence comparisons revealed that multiple geochemically important genes, including genes encoding Fe2+/Pb2+ permeases, most denitrification enzymes, and cytochrome c553, were native to Rhodanobacter and not subjected to LGT. In contrast, the Rhodanobacter pangenome contained a recombinational hot spot in which numerous metal resistance genes were subjected to LGT and/or duplication. In particular, Co2+/Zn2+/Cd2+ efflux and mercuric resistance operon genes appeared to be highly mobile within Rhodanobacter populations. Evidence of multiple duplications of a mercuric resistance operon common to most Rhodanobacter strains was also observed. Collectively, our analyses indicated the importance of LGT during the evolution of groundwater microbial communities in response to heavy metal contamination, and a conceptual model was developed to display such adaptive evolutionary processes for explaining the extreme dominance of Rhodanobacter populations in the contaminated groundwater microbiome
Population Pharmacogenomics for Precision Public Health in Colombia
While genomic approaches to precision medicine hold great promise, they remain prohibitively expensive for developing countries. The precision public health paradigm, whereby healthcare decisions are made at the level of populations as opposed to individuals, provides one way for the genomics revolution to directly impact health outcomes in the developing world. Genomic approaches to precision public health require a deep understanding of local population genomics, which is still missing for many developing countries. We are investigating the population genomics of genetic variants that mediate drug response in an effort to inform healthcare decisions in Colombia. Our work focuses on two neighboring populations with distinct ancestry profiles: Antioquia and Chocó. Antioquia has primarily European genetic ancestry followed by Native American and African components, whereas Chocó shows mainly African ancestry with lower levels of Native American and European admixture. We performed a survey of the global distribution of pharmacogenomic variants followed by a more focused study of pharmacogenomic allele frequency differences between the two Colombian populations. Worldwide, we found pharmacogenomic variants to have both unusually high minor allele frequencies and high levels of population differentiation. A number of these pharmacogenomic variants also show anomalous effect allele frequencies within and between the two Colombian populations, and these differences were found to be associated with their distinct genetic ancestry profiles. For example, the C allele of the single nucleotide polymorphism (SNP) rs4149056 [Solute Carrier Organic Anion Transporter Family Member 1B1 (SLCO1B1)∗5], which is associated with an increased risk of toxicity to a commonly prescribed statin, is found at relatively high frequency in Antioquia and is associated with European ancestry. 
In addition to pharmacogenomic alleles related to increased toxicity risk, we also have evidence that alleles related to dosage and metabolism have large frequency differences between the two populations, which are associated with their specific ancestries. Using these findings, we have developed and validated an inexpensive allele-specific PCR assay to test for the presence of such population-enriched pharmacogenomic SNPs in Colombia. These results serve as an example of how population-centered approaches to pharmacogenomics can help to realize the promise of precision medicine in resource-limited settings
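The comparison of allele frequencies between two populations described in this abstract can be sketched as follows. The abstract does not state which differentiation statistic was used, so the two-population fixation index (F_ST) computed from expected heterozygosity, with both populations weighted equally, is an assumption made for illustration. The frequencies in the example are hypothetical and are not values from the study.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def frequency_difference(p1: float, p2: float) -> float:
    """Absolute difference in effect allele frequency between two populations."""
    return abs(p1 - p2)


def fst_two_populations(p1: float, p2: float) -> float:
    """F_ST = (H_T - H_S) / H_T for two equally weighted populations.

    H_T is the heterozygosity expected from the pooled mean frequency and
    H_S is the mean of the within-population heterozygosities.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = expected_heterozygosity(p_bar)
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    if h_t == 0.0:
        # The allele is fixed or absent in both populations, so there is no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical effect allele frequencies for two populations.
    p_a, p_b = 0.20, 0.05
    print(frequency_difference(p_a, p_b), fst_two_populations(p_a, p_b))
```

Under this definition, identical frequencies give a value of 0 and alternately fixed alleles give a value of 1, so larger values indicate variants for which population-level screening is more informative.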
Population genomics of human polymorphic transposable elements
Transposable element (TE) activity has had a major impact on the human genome; more than two-thirds of the sequence is derived from TE insertions. Several families of human TEs – primarily Alu, L1 and SVA – continue to actively transpose, thereby generating insertion polymorphisms between individuals. Until very recently, it has not been possible to characterize the genetic variation generated by the activity of these TE families at the scale of whole genomes for multiple individuals within and between human populations. For this reason, the impact of recent TE activity on human evolution has yet to be fully appreciated. My dissertation research leverages novel technologies in data science to investigate the role that recent TE activity has played in shaping human population genetic variation. Specifically, my dissertation addresses three problems: 1) evaluation of the computational techniques used to characterize human polymorphic TE insertion sites from whole genome, next-generation sequence data, 2) characterization of the population genomic variation of human polymorphic TEs and evaluation of their effectiveness as markers of human genetic ancestry and admixture, and 3) analysis of the effects that natural selection (negative and positive) has exerted on human polymorphic TE insertions. I close by presenting a broad prospectus on the implications of genome-scale analyses of human polymorphic TE insertions for population and clinical genetic studies. The results reported in this dissertation represent the dawn of the population genomics era for human TEs. Ph.D.
Nitrogen fixing microbial communities in oiled and clean sands of Pensacola Beach, July 2010 and June 2011, and sugarcane and peat samples
These data report the diversity and composition of nitrogen-fixing microbial communities sampled from beach sands collected from Pensacola Beach. Nitrogen-fixers were characterized by the extraction, PCR amplification, and next generation sequencing of the nitrogenase gene (nifH). Sugarcane and peat samples were also analyzed
Additional file 2: Table S2. of Implications of human evolution and admixture for mitochondrial replacement therapy
Counts of mtDNA haplogroups for the HGDP populations analyzed here. Global population distributions of mtDNA haplogroups are organized as shown for the 1KGP in Table 1. (XLSX 14 kb)
Additional file 1: Table S1. of Implications of human evolution and admixture for mitochondrial replacement therapy
HGDP populations analyzed in this study. Figure S1. (A) Phylogenetic tree based on mtDNA haplotype genetic distances and (B) dendrogram showing previously defined relationships among major mtDNA haplogroups. (DOCX 344 kb)
Patterns of transposable element expression and insertion in cancer
Human transposable element (TE) activity in somatic tissues causes mutations that can contribute to tumorigenesis. Indeed, TE insertion mutations have been implicated in the etiology of a number of different cancer types. Nevertheless, the full extent of somatic TE activity, along with its relationship to tumorigenesis, has yet to be fully explored. Recent developments in bioinformatics software make it possible to analyze TE expression levels and TE insertional activity directly from transcriptome (RNA-seq) and whole genome (DNA-seq) next-generation sequence data. We applied these new sequence analysis techniques to matched normal and primary tumor patient samples from The Cancer Genome Atlas (TCGA) in order to analyze the patterns of TE expression and insertion for three cancer types: breast invasive carcinoma, head and neck squamous cell carcinoma, and lung adenocarcinoma. Our analysis focused on the three most abundant families of active human TEs: Alu, SVA and L1. We found evidence for high levels of somatic TE activity for these three families in normal and cancer samples across diverse tissue types. Abundant transcripts for all three TE families were detected in both normal and cancer tissues along with an average of ~80 unique TE insertions per individual patient/tissue. We observed an increase in L1 transcript expression and L1 insertional activity in primary tumor samples for all three cancer types. Tumor-specific TE insertions are enriched for private mutations, consistent with a potentially causal role in tumorigenesis. We used genome feature analysis to investigate two specific cases of putative cancer-causing TE mutations in further detail. An Alu insertion in an upstream enhancer of the CBL tumor suppressor gene is associated with down-regulation of the gene in a single breast cancer patient, and an L1 insertion in the first exon of the BAALC gene also disrupts its expression in head and neck squamous cell carcinoma.
Our results are consistent with widespread somatic activity of human TEs leading to numerous insertion mutations that can contribute to tumorigenesis in a variety of tissues