57 research outputs found

    Genetic Relatedness of Indigenous Ethnic Groups in Northern Borneo to Neighboring Populations from Southeast Asia, as inferred from Genome-wide SNP Data

    The region of northern Borneo is home to the current state of Sabah, Malaysia. It is located closest to the southern Philippine islands and may have served as a viaduct for ancient human migration onto or off of Borneo Island. In this study, five indigenous ethnic groups from Sabah were subjected to genome-wide SNP genotyping. These individuals represent the "North Borneo"-speaking group of the great Austronesian family. They have traditionally resided in the inland region of Sabah. The dataset was merged with public datasets, and the genetic relatedness of these groups to neighboring populations from the islands of Southeast Asia, mainland Southeast Asia and southern China was inferred. Genetic structure analysis revealed that these groups formed a genetic cluster that was independent of the clusters of neighboring populations. Additionally, these groups exhibited near-absolute proportions of a genetic component that is also common among Austronesians from Taiwan and the Philippines. They showed no genetic admixture with Austro-Melanesian populations. Furthermore, phylogenetic analysis showed that they are closely related to non-Austro-Melanesian Filipinos as well as to Taiwan natives but are distantly related to populations from mainland Southeast Asia. Relatively lower heterozygosity and higher pairwise genetic differentiation index (FST) values than those of nearby populations indicate that these groups might have experienced genetic drift in the past, resulting in their differentiation from other Austronesians. Subsequent formal testing suggested that these populations have received no gene flow from neighboring populations. Taken together, these results imply that the indigenous ethnic groups of northern Borneo shared a common ancestor with Taiwan natives and non-Austro-Melanesian Filipinos and then isolated themselves on the inland of Sabah. This isolation presumably led to no admixture with other populations, and these individuals therefore underwent strong genetic differentiation. This report contributes to addressing the paucity of genetic data on representatives from this strategic region of ancient human migration event(s).

    Shared signature of recent positive selection on the TSBP1-BTNL2-HLA-DRA genes in five native populations from North Borneo

    North Borneo (NB) is home to more than 40 native populations. These natives are believed to have undergone local adaptation in response to environmental challenges such as the mosquito-abundant tropical rainforest. We attempted to trace the footprints of natural selection from the genomic data of NB native populations using a panel of 2.2 million genome-wide single nucleotide polymorphisms. As a result, a 13-kb haplotype in the Major Histocompatibility Complex Class II region encompassing candidate genes TSBP1–BTNL2–HLA-DRA was identified to be undergoing natural selection. This putative signature of positive selection is shared among the five NB populations and is estimated to have arisen 5.5 thousand years (220 generations) ago, which coincides with the period of Austronesian expansion. Owing to the long history of endemic malaria in NB, the putative signature of positive selection is postulated to be driven by Plasmodium parasite infection. The findings of this study imply that despite high levels of genetic differentiation, the NB populations might have experienced similar local genetic adaptation resulting from stresses of the shared environment.

    MAPPING THE NATURAL VARIATION OF GLOBAL POPULATIONS

    Ph.D DOCTOR OF PHILOSOPHY (SPH

    The Carbon Footprint of Bioinformatics.

    Funder: Wellcome Trust. Bioinformatic research relies on large-scale computational infrastructures which have a nonzero carbon footprint, but so far no study has quantified the environmental costs of bioinformatic tools and commonly run analyses. In this work, we estimate the carbon footprint of bioinformatics (in kilograms of CO2 equivalent units, kgCO2e) using the freely available Green Algorithms calculator (www.green-algorithms.org, last accessed 2022). We assessed 1) bioinformatic approaches in genome-wide association studies (GWAS), RNA sequencing, genome assembly, metagenomics, phylogenetics, and molecular simulations, as well as 2) computation strategies, such as parallelization, CPU (central processing unit) versus GPU (graphics processing unit), cloud versus local computing infrastructure, and geography. In particular, we found that biobank-scale GWAS emitted substantial kgCO2e and simple software upgrades could make it greener; for example, upgrading from BOLT-LMM v1 to v2.3 reduced carbon footprint by 73%. Moreover, switching from the average data center to a more efficient one can reduce carbon footprint by approximately 34%. Memory over-allocation can also be a substantial contributor to an algorithm's greenhouse gas emissions. The use of faster processors or greater parallelization reduces running time but can lead to greater carbon footprint. Finally, we provide guidance on how researchers can reduce power consumption and minimize kgCO2e. Overall, this work elucidates the carbon footprint of common analyses in bioinformatics and provides solutions which empower a move toward greener research.

    Mapping the genetic diversity of HLA haplotypes in the Japanese populations

    Japan has often been viewed as an Asian country that possesses a genetically homogeneous community. The basis for partitioning the country into prefectures has largely been geographical, although cultural and linguistic differences still exist between some of the districts/prefectures, especially between Okinawa and the mainland prefectures. The Major Histocompatibility Complex (MHC) region has consistently emerged as the most polymorphic region in the human genome, harbouring numerous biologically important variants; nevertheless, the presence of population-specific long haplotypes hinders the imputation of SNPs and classical HLA alleles. Here, we examined the extent of genetic variation at the MHC between eight Japanese populations sampled from Okinawa and six other prefectures located in or close to the mainland of Japan, specifically focusing on the haplotypes observed within each population and the impact of any variation on imputation. Our results indicated that Okinawa was genetically farther from the mainland Japanese than Gujarati Indians were from Tamil Indians, while the mainland Japanese from six prefectures were more homogeneous than northern and southern Han Chinese were with each other. The distribution of haplotypes across Japan was similar, although imputation was most accurate for Okinawa and several mainland prefectures when population-specific panels were used as reference.

    The HLA-DR beta 1 amino acid positions 11-13-26 explain the majority of SLE-MHC associations

    Genetic association of the major histocompatibility complex (MHC) locus is well established in systemic lupus erythematosus (SLE), but the causal functional variants in this region have not yet been discovered. Here we conduct the first fine-mapping study, which thoroughly investigates the SLE-MHC associations down to the amino acid level of major HLA genes in 5,342 unrelated Korean case-control subjects, taking advantage of HLA imputation with a newly constructed Asian HLA reference panel. The most significant association is mapped to amino acid position 13 of HLA-DR beta 1 (P = 2.48 × 10^-17) and its proxy position 11 (P = 4.15 × 10^-17), followed by position 26 in a stepwise conditional analysis (P = 2.42 × 10^-9). Haplotypes defined by amino acid positions 11-13-26 support the reported effects of most classical HLA-DRB1 alleles in Asian and European populations. In conclusion, our study identifies the three amino acid positions at the epitope-binding groove of HLA-DR beta 1 that are responsible for most of the association between SLE and MHC.

    Discordance (%) between imputed genotypes and actually observed minor allele genotypes (1) at rare and low-frequency SNPs on the exome chip but not in the Omni2.5.

    (1) A minor allele genotype is defined as a genotype that carries at least one copy of the minor allele, and discordance here is measured against the total number of observed minor allele genotypes at rare and low-frequency SNPs.
    (2) Phase 1 of the 1KGP, consisting of 1,092 subjects.
    (3) Singapore Sequencing Malay Project, consisting of 96 Southeast Asian Malays that have been whole-genome sequenced at 30X.
    (4) Singapore Sequencing Indian Project, consisting of 36 South Asian Indians that have been whole-genome sequenced at 30X.
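
The discordance measure defined in footnote 1 can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the encoding of each genotype as a minor-allele count (0, 1 or 2), and the rule that an imputed genotype is discordant whenever its count differs from the observed one are all assumptions made for illustration.

```python
def minor_allele_discordance(observed, imputed):
    """Percent discordance among observed minor allele genotypes.

    Assumption: genotypes are encoded as minor-allele counts (0, 1 or 2),
    and `observed` and `imputed` are aligned, equal-length sequences.
    """
    if len(observed) != len(imputed):
        raise ValueError("observed and imputed must have the same length")

    # Keep only genotypes observed to carry at least one minor allele copy;
    # this is the denominator described in the footnote.
    pairs = [(o, i) for o, i in zip(observed, imputed) if o >= 1]
    if not pairs:
        raise ValueError("no observed minor allele genotypes")

    # Assumption: any difference in minor-allele count is a discordance.
    mismatches = sum(1 for o, i in pairs if o != i)
    return 100.0 * mismatches / len(pairs)
```

Homozygous-major observed genotypes are excluded from the denominator, which is what keeps the measure informative at rare SNPs, where almost every genotype carries no minor allele.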

    Evaluating the Coverage and Potential of Imputing the Exome Microarray with Next-Generation Imputation Using the 1000 Genomes Project - Figure 1

    (A) The proportion of monomorphic and polymorphic exonic variants in the Illumina exome chip when assessed in each of the three Singapore populations. The exonic variants on the exome chip are further categorized according to whether they are present in any of the reference panels from the 1000 Genomes Project or the Singapore Sequencing Study for the Malays and Indians (“Covered”) and can in theory be imputed, or not present in any of the existing reference panels and thus cannot be recovered through imputation (“Not covered”). (B) Distribution of SNPs on the exome chip according to the minor allele frequencies (MAFs) into monomorphic (MAF = 0%), rare (0% < MAF ≤ 1%), low-frequency (1% < MAF ≤ 5%) and common (MAF > 5%) in each of the three populations. (C) MAF categorization of the polymorphic exome chip SNPs in each of the three populations according to whether these SNPs are present (non-purple bars) or not (purple bars) in the respective reference panels. Numbers in brackets indicate the number of SNPs in the respective categories.
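
The MAF bins named in panel (B) can be written as a small Python sketch. The thresholds come from the caption; the function name and the choice to pass MAF as a fraction between 0 and 0.5 (rather than a percentage) are assumptions made for illustration.

```python
def maf_category(maf: float) -> str:
    """Assign a SNP to a frequency class from its minor allele frequency.

    Assumption: `maf` is a fraction in [0, 0.5], so 1% is 0.01 and 5% is 0.05.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf == 0.0:
        return "monomorphic"      # MAF = 0%
    if maf <= 0.01:
        return "rare"             # 0% < MAF <= 1%
    if maf <= 0.05:
        return "low-frequency"    # 1% < MAF <= 5%
    return "common"               # MAF > 5%
```

Note that the upper bound of each bin is inclusive, so a SNP at exactly 1% is rare and one at exactly 5% is low-frequency.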