31 research outputs found

    Estimating Sampling Selection Bias in Human Genetics: A Phenomenological Approach

    This research is the first empirical attempt to calculate the various components of the hidden bias associated with the sampling strategies routinely used in human genetics, with special reference to surname-based strategies. We reconstructed surname distributions of 26 Italian communities with different demographic features across the last six centuries (years 1447-2001). The degree of overlap between "reference founding core" distributions and the distributions obtained from sampling the present-day communities by probabilistic and selective methods was quantified under different conditions and models. When taking into account only one individual per surname (low kinship model), the average discrepancy was 59.5%, with a peak of 84% by random sampling. When multiple individuals per surname were considered (high kinship model), the discrepancy decreased by 8-30% at the cost of a larger variance. Criteria aimed at maximizing locally spread patrilineages and long-term residency appeared to be affected by recent gene flows much more than expected. Selection of the more frequent family names following low kinship criteria proved to be a suitable approach only for historically stable communities. In any other case, true random sampling, despite its high variance, did not return more biased estimates than other selective methods. Our results indicate that the sampling of individuals bearing historically documented surnames (founders' method) should be applied, especially when studying the male-specific genome, to prevent an over-stratification of ancient and recent genetic components that heavily biases inferences and statistics.
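The abstract does not state how the overlap between distributions was computed. As a minimal sketch, assuming the overlap is the shared mass of two relative-frequency distributions (the sum of per-surname minima), a discrepancy between a founding-core list and a present-day sample could be computed as below. The function names and the toy surname lists are hypothetical and are not taken from the study.

```python
from collections import Counter


def relative_frequencies(surnames):
    """Map each surname to its share of the list."""
    counts = Counter(surnames)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def discrepancy(reference, sample):
    """Return 1 minus the shared mass of two surname distributions.

    ASSUMPTION: overlap is measured as the sum of per-surname minima of
    the relative frequencies; the study may use a different measure.
    """
    ref = relative_frequencies(reference)
    smp = relative_frequencies(sample)
    names = set(ref) | set(smp)
    overlap = sum(min(ref.get(n, 0.0), smp.get(n, 0.0)) for n in names)
    return 1.0 - overlap


def one_per_surname(surnames):
    """Low kinship model: keep a single individual per surname."""
    return sorted(set(surnames))


# Hypothetical toy data, for illustration only.
founders = ["Rossi", "Rossi", "Bianchi", "Conti"]
present_day = ["Rossi", "Bianchi", "Esposito", "Esposito"]

high_kinship = discrepancy(founders, present_day)
low_kinship = discrepancy(one_per_surname(founders), one_per_surname(present_day))
print(f"high kinship: {high_kinship:.3f}, low kinship: {low_kinship:.3f}")
```

In this toy case the two models give different discrepancies from the same individuals, which is the kind of contrast the abstract reports between its low and high kinship models.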

    Association between Variants of the TRPV1 Gene and Body Composition in Sub-Saharan Africans

    In humans, the transient receptor potential vanilloid 1 (TRPV1) gene is activated by exogenous (e.g., high temperatures, irritating compounds such as capsaicin) and endogenous (e.g., endocannabinoids, inflammatory factors, fatty acid metabolites, low pH) stimuli. It has been shown to be involved in several processes including nociception, thermosensation, and energy homeostasis. In this study, we investigated the association between TRPV1 gene variants, sensory perception (to capsaicin and PROP), and body composition (BMI and bioimpedance variables) in human populations. By comparing sequences deposited in worldwide databases, we identified two haplotype blocks (herein referred to as H1 and H2) that show strong stabilizing selection signals (MAF approaching 0.50, Tajima’s D > +4.5) only in individuals with sub-Saharan African ancestry. We therefore studied the genetic variants of these two regions in 46 volunteers of sub-Saharan descent and 45 Italian volunteers (both sexes). Linear regression analyses showed significant associations between TRPV1 diplotypes and body composition, but not with capsaicin perception. Specifically, in African women carrying the H1-b and H2-b haplotypes, a higher percentage of fat mass and lower extracellular fluid retention were observed, whereas no significant association was found in men. Our results suggest the possible action of sex-driven balancing selection at the non-coding sequences of the TRPV1 gene, with adaptive effects on water balance and lipid deposition.
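For readers unfamiliar with the statistic cited above, Tajima's D compares the mean number of pairwise differences with the number of segregating sites; an excess of intermediate-frequency variants pushes it above zero. The sketch below implements the standard formula (Tajima 1989) for equal-length aligned sequences. The toy alignments are hypothetical and unrelated to the TRPV1 data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of equal-length aligned sequences."""
    n = len(sequences)
    length = len(sequences[0])
    # S: number of sites at which more than one state is observed.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in sequences}) > 1)
    if n < 2 or seg_sites == 0:
        raise ValueError("need at least two sequences and one segregating site")
    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical toy alignments, for illustration only.
two_balanced_haplotypes = ["AAAA", "AAAA", "TTTT", "TTTT"]
singletons_only = ["TAAA", "ATAA", "AATA", "AAAT"]

print(tajimas_d(two_balanced_haplotypes))  # positive: variants at frequency 0.5
print(tajimas_d(singletons_only))          # negative: excess of rare variants
```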

    Genetic History of the Population of Corsica (Western Mediterranean) as Inferred from Autosomal STR Analysis

    To genetically reconstruct the demographic history of the human population of Corsica (western Mediterranean), we analyzed the variability at eight autosomal STR loci (FES, VWA, CSF1PO, TH01, F13A1, TPOX, CD4, and D3S1358) in a sample of 179 native blood donors from 4 out of the 5 administrative districts. The main line of genetic discontinuity inferred from the spatial distribution of STR variability overlapped the linguistic and geographic boundaries. In the innermost areas (Corte district), several estimators indicated larger stochastic effects on allele frequencies. Genetic distance measures underlying different evolutionary models all pointed to a higher variability within Corsicans than within the rest of the Mediterranean reference populations. All Corsican subsamples showed the greatest distance from a pooled sample from central Sardinia, thus making recent gene flow between the two neighboring islands unlikely. Hierarchical AMOVA and distance-based multivariate genetic spaces stressed the closeness of Tuscan and Corsican frequency distributions, which could reflect peopling events with different time depths. In any case, the estimated separation times lend good support to the linguistic hypothesis that Neolithic/Chalcolithic events have been far more important than Paleolithic or historical processes in the shaping of present Corsican variability.

    Tools to predicting binary states on the human Y chromosome from STR data

    A novel Bayesian algorithm, “WYZARD”, is designed to predict binary states on the human Y chromosome from STR data. It allows users to retrieve linkage probabilities between combinations of alleles at the 8 most widely used STR loci (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS388) and the derived mutations defining 1 super-haplogroup [F(xK)], 4 haplogroups (I, L, N, Q) and 14 sub-haplogroups (E3a, E3b1a, E3b1b, E3b3, G1, G2, I1a, I1b, I1c, J1, J2, R1a, R1b, R2), which encompass 99% of West Eurasian variability. Prior probabilities were calculated from a geographically unbiased repository of 3,672 chromosomes we collected from published and unpublished sources. The robustness of the WYZARD approach and of six other approaches to haplogroup assignment following distance-, Bayesian- and frequency-based methods was assessed by comparing predictions against the true haplogroup of 135 haplotypes of Austrian origin. Incorrect assignments ranged between 11.1% and 16.3%, with WYZARD giving the lowest value among Bayesian methods (14.8%). Because misleading results were limited to a few pairs of haplogroups, a 100% rate of correct assignments can be reached by introducing STR-based predictions in routine protocols for Y binary screening. This would shortcut the diagnosis of binary mutations at costs 50-70% lower than those of stand-alone approaches.
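The abstract does not describe WYZARD's internal model. As a generic illustration of Bayesian haplogroup assignment from STR alleles, the sketch below applies Bayes' rule under a naive independence assumption across loci, which is an assumption of this example and not a claim about WYZARD. The function name, the pseudo-frequency for unseen alleles, and all priors and allele frequencies are hypothetical.

```python
def haplogroup_posteriors(haplotype, priors, allele_freqs, pseudo=1e-3):
    """Posterior probability of each haplogroup given one STR haplotype.

    ASSUMPTION: loci are treated as independent given the haplogroup
    (naive Bayes). Alleles never seen in a haplogroup receive the small
    pseudo-frequency `pseudo` so that the product is never exactly zero.
    """
    scores = {}
    for haplogroup, prior in priors.items():
        likelihood = 1.0
        for locus, allele in haplotype.items():
            likelihood *= allele_freqs[haplogroup].get(locus, {}).get(allele, pseudo)
        scores[haplogroup] = prior * likelihood
    total = sum(scores.values())
    return {haplogroup: score / total for haplogroup, score in scores.items()}


# Hypothetical priors and per-locus allele frequencies, for illustration only.
priors = {"R1b": 0.6, "I1a": 0.4}
allele_freqs = {
    "R1b": {"DYS19": {14: 0.8, 15: 0.2}, "DYS390": {23: 0.3, 24: 0.7}},
    "I1a": {"DYS19": {14: 0.5, 15: 0.5}, "DYS390": {22: 0.6, 23: 0.4}},
}

posteriors = haplogroup_posteriors({"DYS19": 14, "DYS390": 24}, priors, allele_freqs)
best = max(posteriors, key=posteriors.get)
print(best, round(posteriors[best], 4))
```

A prediction of this kind could be used to decide which binary marker to type first, which is the cost saving the abstract describes.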