202 research outputs found

    Analysis of East Asia Genetic Substructure Using Genome-Wide SNP Arrays

    Accounting for population genetic substructure is important in reducing type 1 errors in genetic studies of complex disease. As efforts to understand complex genetic disease are expanded to different continental populations, an understanding of genetic substructure within these continents will be useful in the design and execution of association tests. In this study, population differentiation (Fst) and Principal Components Analyses (PCA) are examined using >200 K genotypes from multiple populations of East Asian ancestry. The population groups included those from the Human Genome Diversity Panel [Cambodian, Yi, Daur, Mongolian, Lahu, Dai, Hezhen, Miaozu, Naxi, Oroqen, She, Tu, Tujia, Xibo, and Yakut], HapMap [Han Chinese (CHB) and Japanese (JPT)], and East Asian or East Asian American subjects of Vietnamese, Korean, Filipino and Chinese ancestry. Paired Fst (Weir and Cockerham) showed close relationships between CHB and several large East Asian population groups (CHB/Korean, 0.0019; CHB/JPT, 0.00651; CHB/Vietnamese, 0.0065) with larger separation from Filipino (CHB/Filipino, 0.014). Low levels of differentiation were also observed between Dai and Vietnamese (0.0045) and between Vietnamese and Cambodian (0.0062). Similarly, small Fst values were observed among different presumed Han Chinese populations originating in different regions of mainland China and Taiwan (Fst values <0.0025 with CHB). For PCA, the first two PCs showed a pattern of relationships that closely followed the geographic distribution of the different East Asian populations. PCA showed substructure both between different East Asian groups and within the Han Chinese population. These studies have also identified a subset of East Asian substructure ancestry informative markers (EASTASAIMS) that may be useful for future complex genetic disease association studies in reducing type 1 errors and in identifying homogeneous groups that may increase the power of such studies.
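
As a rough illustration of what a pairwise Fst value measures, the sketch below computes a simplified Wright-style Fst for two populations from allele frequencies. It is not the Weir and Cockerham estimator named in the abstract, which also corrects for sample size. The function names and the equal weighting of the two populations are assumptions made for this example.

```python
def wright_fst(p1, p2):
    """Simplified two-population Fst for one biallelic SNP.

    p1, p2: frequency of the same allele in population 1 and 2.
    Assumes equal population weights and ignores sampling error.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # allele fixed or absent in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def multilocus_fst(freqs1, freqs2):
    """Combine SNPs as a ratio of summed numerators and denominators."""
    num = 0.0
    den = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        p_bar = (p1 + p2) / 2
        h_total = 2 * p_bar * (1 - p_bar)
        h_within = p1 * (1 - p1) + p2 * (1 - p2)
        num += h_total - h_within
        den += h_total
    return num / den if den > 0 else 0.0
```

Identical frequencies give 0 and fixed differences give 1, so values near 0.002 to 0.014, as reported above, indicate closely related populations.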

    European American Stratification in Ovarian Cancer Case Control Data: The Utility of Genome-Wide Data for Inferring Ancestry

    We investigated the ability of several principal components analysis (PCA)-based strategies to detect and control for population stratification using data from a multi-center study of epithelial ovarian cancer among women of European-American ethnicity. These include a correction based on an ancestry informative markers (AIMs) panel designed to capture European ancestral variation and corrections utilizing un-thinned genome-wide SNP data; case-control samples were drawn from four geographically distinct North American sites. The AIMs-only and genome-wide first principal components (PC1) both corresponded to the previously described North or Northwest-Southeast axis of European variation. We found that the genome-wide PCA captured this primary dimension of variation more precisely and identified additional axes of genome-wide variation of relevance to epithelial ovarian cancer. Associations evident between the genome-wide PCs and study site corroborate North American immigration history and suggest that undiscovered dimensions of variation lie within Northern Europe. The structure captured by the genome-wide PCA was also found within control individuals and did not reflect the case-control variation present in the data. The genome-wide PCA highlighted three regions of local LD, corresponding to the lactase (LCT) gene on chromosome 2, the human leukocyte antigen system (HLA) on chromosome 6 and a common inversion polymorphism on chromosome 8. These features did not compromise the efficacy of PCs from this analysis for ancestry control. This study concludes that although AIMs panels are a cost-effective way of capturing population structure, genome-wide data should preferably be used when available.

    Specificity of the STAT4 Genetic Association for Severe Disease Manifestations of Systemic Lupus Erythematosus

    Systemic lupus erythematosus (SLE) is a genetically complex disease with heterogeneous clinical manifestations. A polymorphism in the STAT4 gene has recently been established as a risk factor for SLE, but the relationship with specific SLE subphenotypes has not been studied. We studied 137 SNPs in the STAT4 region genotyped in 4 independent SLE case series (total n = 1398) and 2560 healthy controls, along with clinical data for the cases. Using conditional testing, we confirmed the most significant STAT4 haplotype for SLE risk. We then studied a SNP marking this haplotype for association with specific SLE subphenotypes, including autoantibody production, nephritis, arthritis, mucocutaneous manifestations, and age at diagnosis. To prevent possible type-I errors from population stratification, we reanalyzed the data using a subset of subjects determined to be most homogeneous based on principal components analysis of genome-wide data. We confirmed that four SNPs in very high LD (r² = 0.94 to 0.99) were most strongly associated with SLE, and there was no compelling evidence for additional SLE risk loci in the STAT4 region. SNP rs7574865 marking this haplotype had a minor allele frequency (MAF) = 31.1% in SLE cases compared with 22.5% in controls (OR = 1.56, p = 10⁻¹⁶). This SNP was more strongly associated with SLE characterized by double-stranded DNA autoantibodies (MAF = 35.1%, OR = 1.86, p < 10⁻¹⁹), nephritis (MAF = 34.3%, OR = 1.80, p < 10⁻¹¹), and age at diagnosis < 30 years (MAF = 33.8%, OR = 1.77, p < 10⁻¹³). An association with severe nephritis was even more striking (MAF = 39.2%, OR = 2.35, p < 10⁻⁴ in the homogeneous subset of subjects). In contrast, STAT4 was less strongly associated with oral ulcers, a manifestation associated with milder disease. We conclude that this common polymorphism of STAT4 contributes to the phenotypic heterogeneity of SLE, predisposing specifically to more severe disease.
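
The odds ratios quoted above can be approximately recovered from the minor allele frequencies alone. The following is a minimal sketch, assuming a simple allelic (per-chromosome) odds ratio; the function name is hypothetical, and the published values were presumably computed from allele counts rather than from rounded percentages, so small differences are expected.

```python
def allelic_odds_ratio(maf_cases, maf_controls):
    """Odds of carrying the minor allele in cases divided by odds in controls.

    Both arguments are allele frequencies strictly between 0 and 1.
    """
    odds_cases = maf_cases / (1 - maf_cases)
    odds_controls = maf_controls / (1 - maf_controls)
    return odds_cases / odds_controls


if __name__ == "__main__":
    # Frequencies taken from the abstract: overall SLE and anti-dsDNA subset,
    # each compared with the 22.5% control frequency.
    for label, maf in (("SLE overall", 0.311), ("dsDNA autoantibodies", 0.351)):
        print(label, round(allelic_odds_ratio(maf, 0.225), 2))
```

For the overall comparison (31.1% versus 22.5%) this gives a value within about 0.01 of the reported OR of 1.56.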

    Allele-Specific Gene Expression Is Widespread Across the Genome and Biological Processes

    Allele-specific gene expression (ASGE) appears to be an important factor in human phenotypic variability and, as a consequence, in the development of complex traits and diseases. To study ASGE across the human genome, we coupled genotyping with an analysis of ASGE, screening 11,500 SNPs using the Mapping 10 K Array to identify differential allelic expression. Of the 5,133 SNPs that were suitable for analysis (heterozygous in our sample and expressed in peripheral blood mononuclear cells), 2,934 (57%) had differential allelic expression. Such SNPs were equally distributed along human chromosomes and biological processes. We validated the presence or absence of ASGE in 18 out of 20 randomly selected SNPs (90%) by real-time PCR in 48 human subjects. In addition, we observed that SNPs close to, but not included in, segmental duplications had increased levels of ASGE. Finally, we found that transcripts of unknown function or non-coding RNAs also display ASGE: from a total of 2,308 intronic SNPs, 1,510 (65%) underwent differential allelic expression. In summary, ASGE is a widespread mechanism in the human genome whose regulation seems to be far more complex than expected.

    A Genome-Wide Association Study of Psoriasis and Psoriatic Arthritis Identifies New Disease Loci

    A genome-wide association study was performed to identify genetic factors involved in susceptibility to psoriasis (PS) and psoriatic arthritis (PSA), inflammatory diseases of the skin and joints in humans. 223 PS cases (including 91 with PSA) were genotyped with 311,398 single nucleotide polymorphisms (SNPs), and results were compared with those from 519 Northern European controls. Replications were performed with an independent cohort of 577 PS cases and 737 controls from the U.S., and 576 PSA patients and 480 controls from the U.K. Strongest associations were with the class I region of the major histocompatibility complex (MHC). The most highly associated SNP was rs10484554, which lies 34.7 kb upstream from HLA-C (P = 7.8×10⁻¹¹, GWA scan; P = 1.8×10⁻³⁰, replication; P = 1.8×10⁻³⁹, combined; U.K. PSA: P = 6.9×10⁻¹¹). However, rs2395029 encoding the G2V polymorphism within the class I gene HCP5 (combined P = 2.13×10⁻²⁶ in U.S. cases) yielded the highest ORs with both PS and PSA (4.1 and 3.2 respectively). This variant is associated with low viral set point following HIV infection and its effect is independent of rs10484554. We replicated the previously reported association with interleukin 23 receptor (IL23R) and interleukin 12B (IL12B) polymorphisms in PS and PSA cohorts (IL23R: rs11209026, U.S. PS, P = 1.4×10⁻⁴; U.K. PSA: P = 8.0×10⁻⁴; IL12B: rs6887695, U.S. PS, P = 5×10⁻⁵ and U.K. PSA, P = 1.3×10⁻³) and detected an independent association in the IL23R region with a SNP 4 kb upstream from IL12RB2 (P = 0.001). Novel associations replicated in the U.S. PS cohort included the region harboring lipoma HMGIC fusion partner (LHFP) and conserved oligomeric golgi complex component 6 (COG6) genes on chromosome 13q13 (combined P = 2×10⁻⁶ for rs7993214; OR = 0.71), the late cornified envelope gene cluster (LCE) from the Epidermal Differentiation Complex (PSORS4) (combined P = 6.2×10⁻⁵ for rs6701216; OR = 1.45) and a region of LD at 15q21 (combined P = 2.9×10⁻⁵ for rs3803369; OR = 1.43). This region is of interest because it harbors ubiquitin-specific protease-8 whose processed pseudogene lies upstream from HLA-C. This region of 15q21 also harbors the gene for SPPL2A (signal peptide peptidase like 2a) which activates tumor necrosis factor alpha by cleavage, triggering the expression of IL12 in human dendritic cells. We also identified a novel PSA (and potentially PS) locus on chromosome 4q27. This region harbors the interleukin 2 (IL2) and interleukin 21 (IL21) genes and was recently shown to be associated with four autoimmune diseases (Celiac disease, Type 1 diabetes, Graves' disease and Rheumatoid Arthritis).

    Genomic microsatellites identify shared Jewish ancestry intermediate between Middle Eastern and European populations

    Background: Genetic studies have often produced conflicting results on the question of whether distant Jewish populations in different geographic locations share greater genetic similarity to each other or instead, to nearby non-Jewish populations. We perform a genome-wide population-genetic study of Jewish populations, analyzing 678 autosomal microsatellite loci in 78 individuals from four Jewish groups together with similar data on 321 individuals from 12 non-Jewish Middle Eastern and European populations. Results: We find that the Jewish populations show a high level of genetic similarity to each other, clustering together in several types of analysis of population structure. Further, Bayesian clustering, neighbor-joining trees, and multidimensional scaling place the Jewish populations as intermediate between the non-Jewish Middle Eastern and European populations. Conclusion: These results support the view that the Jewish populations largely share a common Middle Eastern ancestry and that over their history they have undergone varying degrees of admixture with non-Jewish populations of European descent.

    Impact of the AHI1 Gene on the Vulnerability to Schizophrenia: A Case-Control Association Study

    Background: The Abelson helper integration-1 (AHI1) gene is required for both cerebellar and cortical development in humans. While the accelerated evolution of AHI1 in the human lineage indicates a role in cognitive (dys)function, a linkage scan in large pedigrees identified AHI1 as a positional candidate for schizophrenia. To further investigate the contribution of AHI1 to the susceptibility of schizophrenia, we evaluated the effect of AHI1 variation on the vulnerability to psychosis in two samples from Spain and Germany. Methodology/Principal Findings: 29 single-nucleotide polymorphisms (SNPs) located in a genomic region including the AHI1 gene were genotyped in two samples from Spain (280 patients with psychotic disorders; 348 controls) and Germany (247 patients with schizophrenic disorders; 360 controls). Allelic, genotypic and haplotype frequencies were compared between cases and controls in both samples separately, as well as in the combined sample. The effect of genotype on several psychopathological measures (BPRS, KGV, PANSS) assessed in a Spanish subsample was also evaluated. We found several significant associations in the Spanish sample. Particularly, rs7750586 and rs911507, both located upstream of the AHI1 coding region, were found to be associated with schizophrenia in the analysis of genotypic (p = 0.0033 and 0.031, respectively) and allelic frequencies (p = 0.001 in both cases). Moreover, several other risk and protective haplotypes were detected (0.006 < p < 0.036). Joint analysis also supported the association of rs7750586 and rs911507 with the risk for schizophrenia. The analysis of clinical measures also revealed an effect on symptom severity (minimum P value = 0.0037). Conclusions/Significance: Our data support, in agreement with previous reports, an effect of AHI1 variation on the susceptibility to schizophrenia in central and southern European populations.

    Straightforward Inference of Ancestry and Admixture Proportions through Ancestry-Informative Insertion Deletion Multiplexing

    Ancestry-informative markers (AIMs) show high allele frequency divergence between different ancestral or geographically distant populations. These genetic markers are especially useful in inferring the likely ancestral origin of an individual or estimating the apportionment of ancestry components in admixed individuals or populations. The study of AIMs is of great interest in clinical genetics research, particularly to detect and correct for population substructure effects in case-control association studies, but also in population and forensic genetics studies.

    Comparison of measures of marker informativeness for ancestry and admixture mapping

    Background: Admixture mapping is a powerful gene mapping approach for an admixed population formed from ancestral populations with different allele frequencies. The power of this method relies on the ability of ancestry informative markers (AIMs) to infer ancestry along the chromosomes of admixed individuals. In this study, more than one million SNPs from HapMap databases and simulated data have been interrogated in admixed populations using various measures of ancestry informativeness: Fisher Information Content (FIC), Shannon Information Content (SIC), F statistics (F_ST), Informativeness for Assignment Measure (I_n), and the Absolute Allele Frequency Differences (delta, δ). The objectives are to compare these measures of informativeness to select SNP markers for ancestry inference, and to determine the accuracy of AIM panels selected by each measure in estimating the contributions of the ancestors to the admixed population. Results: F_ST and I_n had the highest Spearman correlation and the best agreement as measured by Kappa statistics based on deciles. Although the different measures of marker informativeness performed comparably well, analyses based on the top 1 to 10% ranked informative markers of simulated data showed that I_n was better in estimating ancestry for an admixed population. Conclusions: Although millions of SNPs have been identified, only a small subset needs to be genotyped in order to accurately predict ancestry with a minimal error rate in a cost-effective manner. In this article, we compared various methods for selecting ancestry informative SNPs using simulations as well as SNP genotype data from samples of admixed populations and showed that the I_n measure estimates ancestry proportion (in an admixed population) with lower bias and mean square error.
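
To make two of the measures named above concrete, the sketch below computes δ and I_n for a biallelic SNP in two source populations and ranks markers by either measure. It is an illustration only, not the authors' pipeline: the I_n formula follows the commonly used definition with equal population weights, and the function names and the marker dictionary format are assumptions made for this example.

```python
import math


def _xlogx(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def delta(p1, p2):
    """Absolute allele frequency difference between two populations."""
    return abs(p1 - p2)


def informativeness_for_assignment(p1, p2):
    """I_n for a biallelic SNP, two populations, equal weights.

    Sums over both alleles: -p_mean*ln(p_mean) + mean_i(p_i*ln(p_i)).
    Ranges from 0 (identical frequencies) to ln(2) (fixed difference).
    """
    total = 0.0
    for a1, a2 in ((p1, p2), (1 - p1, 1 - p2)):
        mean = (a1 + a2) / 2
        total += -_xlogx(mean) + (_xlogx(a1) + _xlogx(a2)) / 2
    return total


def rank_markers(freqs, measure):
    """Return marker names sorted from most to least informative.

    freqs: dict mapping marker name -> (p1, p2) allele frequencies.
    measure: a function such as delta or informativeness_for_assignment.
    """
    return sorted(freqs, key=lambda name: measure(*freqs[name]), reverse=True)
```

Selecting the top-ranked fraction of markers from such a list is the kind of panel construction the abstract evaluates.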