30 research outputs found

    The light skin allele of SLC24A5 in South Asians and Europeans shares identity by descent.

    Skin pigmentation is one of the most variable phenotypic traits in humans. A non-synonymous substitution (rs1426654) in the third exon of SLC24A5 accounts for lighter skin in Europeans but not in East Asians. A previous genome-wide association study carried out in a heterogeneous sample of UK immigrants of South Asian descent suggested that this gene also contributes significantly to skin pigmentation variation among South Asians. In the present study, we have quantitatively assessed skin pigmentation for a largely homogeneous cohort of 1228 individuals from the Southern region of the Indian subcontinent. Our data confirm significant association of rs1426654 SNP with skin pigmentation, explaining about 27% of total phenotypic variation in the cohort studied. Our extensive survey of the polymorphism in 1573 individuals from 54 ethnic populations across the Indian subcontinent reveals wide presence of the derived-A allele, although the frequencies vary substantially among populations. We also show that the geospatial pattern of this allele is complex, but most importantly, reflects strong influence of language, geography and demographic history of the populations. Sequencing 11.74 kb of SLC24A5 in 95 individuals worldwide reveals that the rs1426654-A alleles in South Asian and West Eurasian populations are monophyletic and occur on the background of a common haplotype that is characterized by low genetic diversity. We date the coalescence of the light skin associated allele at 22-28 KYA. Both our sequence and genome-wide genotype data confirm that this gene has been a target for positive selection among Europeans. However, the latter also shows additional evidence of selection in populations of the Middle East, Central Asia, Pakistan and North India but not in South India

    Genotype-Phenotype Study of the Middle Gangetic Plain in India Shows Association of rs2470102 with Skin Pigmentation

    Our understanding of the genetics of skin pigmentation has been largely skewed towards populations of European ancestry, imparting less attention to South Asian populations, who behold huge pigmentation diversity. Here, we investigate skin pigmentation variation in a cohort of 1,167 individuals in the Middle Gangetic Plain of the Indian subcontinent. Our data confirm the association of rs1426654 with skin pigmentation among South Asians, consistent with previous studies, and also show association for rs2470102 single nucleotide polymorphism. Our haplotype analyses further help us delineate the haplotype distribution across social categories and skin color. Taken together, our findings suggest that the social structure defined by the caste system in India has a profound influence on the skin pigmentation patterns of the subcontinent. In particular, social category and associated single nucleotide polymorphisms explain about 32% and 6.4%, respectively, of the total phenotypic variance. Phylogeography of the associated single nucleotide polymorphisms studied across 52 diverse populations of the Indian subcontinent shows wide presence of the derived alleles, although their frequencies vary across populations. Our results show that both polymorphisms (rs1426654 and rs2470102) play an important role in the skin pigmentation diversity of South Asians

    On reparameterization of random effects in linear mixed models

    The Empirical Best Linear Unbiased Predictor of random effects in linear mixed models may be non-unique. For fixed effects two approaches are used to derive unique solutions – one is based on using estimable linear combinations of parameters and the other one uses reparametrization constraints. It is shown that both approaches can be applied in a similar manner to derive unique prediction results for random effects

    Invariance of the BLUE under the linear fixed and mixed effects models

    We consider the estimation of the parametric function X₁β₁ under the partitioned linear fixed effects model y = X₁β₁ + X₂β₂ + ε and the linear mixed effects model y = X₁β₁ + X₂γ₂ + ε, where γ₂ is considered to be a random vector. Particularly, we consider when the best linear unbiased estimator, BLUE, of X₁β₁ under the linear fixed effects model equals the corresponding BLUE under the linear mixed effects model.

    PlasmidSeeker: identification of known plasmids from bacterial whole genome sequencing reads

    Background Plasmids play an important role in the dissemination of antibiotic resistance, making their detection an important task. Using whole genome sequencing (WGS), it is possible to capture both bacterial and plasmid sequence data, but short read lengths make plasmid detection a complex problem. Results We developed a tool named PlasmidSeeker that enables the detection of plasmids from bacterial WGS data without read assembly. The PlasmidSeeker algorithm is based on k-mers and uses k-mer abundance to distinguish between plasmid and bacterial sequences. We tested the performance of PlasmidSeeker on a set of simulated and real bacterial WGS samples, resulting in 100% sensitivity and 99.98% specificity. Conclusion PlasmidSeeker enables quick detection of known plasmids and complements existing tools that assemble plasmids de novo. The PlasmidSeeker source code is stored on GitHub: https://github.com/bioinfo-ut/PlasmidSeeker
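    The k-mer abundance idea in the abstract above can be illustrated with a toy sketch. This is not PlasmidSeeker's implementation: the function names, the sequences, and the choice of median abundance as the summary statistic are all assumptions made for illustration. It counts k-mers directly in unassembled reads and reports, for a reference sequence, what fraction of its k-mers appear in the reads and how abundant they are.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_read_kmers(reads, k):
    """Count k-mers across all reads, with no assembly step."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def reference_coverage(reference, read_counts, k):
    """Return (fraction of the reference's distinct k-mers found in the reads,
    median read abundance of the k-mers that were found)."""
    ref_kmers = set(kmers(reference, k))
    seen = [read_counts[km] for km in ref_kmers if read_counts[km] > 0]
    fraction = len(seen) / len(ref_kmers) if ref_kmers else 0.0
    return fraction, (median(seen) if seen else 0)


# Hypothetical toy data: one chromosome copy, three plasmid copies.
K = 4
chromosome = "AAAACCCCGG"
plasmid = "TTGCATGCAA"
reads = [chromosome] + [plasmid] * 3

read_counts = count_read_kmers(reads, K)
chrom_fraction, chrom_abundance = reference_coverage(chromosome, read_counts, K)
plasmid_fraction, plasmid_abundance = reference_coverage(plasmid, read_counts, K)
absent_fraction, absent_abundance = reference_coverage("GGGGTTTT", read_counts, K)
```

    In this toy input the plasmid's k-mers are fully covered and their median abundance is higher than the chromosome's, which is the kind of signal an abundance-based method could use to separate a multi-copy plasmid from chromosomal sequence. Real reads would also need handling of reverse complements and sequencing errors, which this sketch omits.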

    StrainSeeker: fast identification of bacterial strains from raw sequencing reads using user-provided guide trees

    ABSTRACT Background: Fast, accurate and high-throughput identification of bacterial isolates is in great demand. The present work was conducted to investigate the possibility of identifying isolates from unassembled next-generation sequencing reads using custom-made guide trees. Results: A tool named StrainSeeker was developed that constructs a list of specific k-mers for each node of any given Newick-format tree and enables the identification of bacterial isolates in 1-2 min. It uses a novel algorithm, which analyses the observed and expected fractions of node-specific k-mers to test the presence of each node in the sample. This allows StrainSeeker to determine where the isolate branches off the guide tree and assign it to a clade whereas other tools assign each read to a reference genome. Using a dataset of 100 Escherichia coli isolates, we demonstrate that StrainSeeker can predict the clades of E. coli with 92% accuracy and correct tree branch assignment with 98% accuracy. Twenty-five thousand Illumina HiSeq reads are sufficient for identification of the strain. Conclusion: StrainSeeker is a software program that identifies bacterial isolates by assigning them to nodes or leaves of a custom-made guide tree. StrainSeeker's web interface and pre-computed guide trees are available a

    Phasing and allelic composition of normal and CNV-carrying haplotypes on parental homologous chromosomes.

    A chromosomal region involving copy number variation is denoted with ‘R2’. In the given example, father is the carrier of two normal haplotypes of ‘R2’ on chromosomes P1 and P2 (diploid copy number of ‘R2’, CN = 2), whereas mother has a combination of a duplication-carrying (on M1) and normal (M2) haplotypes (diploid copy number of ‘R2’, CN = 3). Haplotype-informative SNP genotypes in ‘R2’ sequence that can be used for phasing and determining the parental origin (in offspring) of given normal and CNV-carrying haplotypes are given in bold letters and genotypes that are polymorphic between normal or duplication-carrying parental haplotypes are indicated with dashed rectangles. The duplication-carrying haplotype on maternal M1 chromosome is composed of two allelic copies of the sequence ‘R2’ distinguished by genotype variability at position SNP7 (polymorphic SNP variant within the duplication-carrying haplotype), indicated with dotted rectangle.

    Computational phasing of normal and CNV-carrying haplotypes.

    (A) First, CNV and regular two-letter genotypes are collected from the QuantiSNP output for each family member at a locus of interest. (B) Next, markers that have any low-confidence genotype calls or the call could not have been made (‘NC’ genotypes, e.g. marker rs10801575, marked with the red background) and monomorphic markers that are not informative for haplotype phasing in the studied region (e.g. marker rs7517836; marked with the red background) are filtered out. (C) Informative high-confidence genotypes are then phased considering all family members simultaneously and the resulting haplotypes are presented as the result. (D) The family tree of these phased haplotypes can be further visualised for the corresponding CNV region.
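    The marker-filtering step (B) described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the genotype encoding (strings of allele letters, with ‘NC’ for a no-call), the function names, and every genotype value shown are assumptions. The two rs identifiers come from the caption, but their genotypes here are invented, and `marker_x` is a hypothetical marker. Low-confidence calls are assumed to have already been converted to ‘NC’.

```python
NO_CALL = "NC"  # assumed encoding for a missing or low-confidence call


def is_informative(calls):
    """Return True if a marker is usable for phasing in this family.

    calls maps family member -> genotype string, e.g. 'AG', or 'AAG' for a
    three-copy CNV genotype. A marker is dropped if any member has a no-call,
    or if only one allele is observed across the whole family (monomorphic).
    """
    genotypes = list(calls.values())
    if NO_CALL in genotypes:
        return False
    alleles = set("".join(genotypes))
    return len(alleles) > 1


def filter_markers(family_genotypes):
    """Keep only markers that pass is_informative."""
    return {marker: calls
            for marker, calls in family_genotypes.items()
            if is_informative(calls)}


# Invented genotypes for illustration only.
family = {
    "rs10801575": {"father": "AG", "mother": "NC", "child": "AG"},   # has a no-call
    "rs7517836": {"father": "CC", "mother": "CCC", "child": "CC"},   # monomorphic
    "marker_x": {"father": "AA", "mother": "AAG", "child": "AG"},    # informative
}
kept = filter_markers(family)
```

    Only `marker_x` survives the filter in this example. The phasing step (C) itself is not sketched here, since the caption does not describe how it is carried out.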