1,190 research outputs found

    Assessment of the genetic basis of rosacea by genome-wide association study.

    Rosacea is a common, chronic skin disease that is currently incurable. Although environmental factors influence rosacea, the genetic basis of rosacea is not established. In this genome-wide association study, a discovery group of 22,952 individuals (2,618 rosacea cases and 20,334 controls) was analyzed, leading to identification of two significant single-nucleotide polymorphisms (SNPs) associated with rosacea, one of which replicated in a new group of 29,481 individuals (3,205 rosacea cases and 26,262 controls). The confirmed SNP, rs763035 (P=8.0 × 10^-11 discovery group; P=0.00031 replication group), is intergenic between HLA-DRA and BTNL2. Exploratory immunohistochemical analysis of HLA-DRA and BTNL2 expression in papulopustular rosacea lesions from six individuals, including one with the rs763035 variant, revealed staining in the perifollicular inflammatory infiltrate of rosacea for both proteins. In addition, three HLA alleles, all MHC class II proteins, were significantly associated with rosacea in the discovery group and confirmed in the replication group: HLA-DRB1*03:01 (P=1.0 × 10^-8 discovery group; P=4.4 × 10^-6 replication group), HLA-DQB1*02:01 (P=1.3 × 10^-8 discovery group; P=7.2 × 10^-6 replication group), and HLA-DQA1*05:01 (P=1.4 × 10^-8 discovery group; P=7.6 × 10^-6 replication group). Collectively, the gene variants identified in this study support the concept of a genetic component for rosacea, and provide candidate targets for future studies to better understand and treat rosacea.
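
As a rough illustration of how the discovery-group P-values quoted in this abstract relate to a genome-wide significance cutoff, the sketch below compares them with the conventional threshold of 5e-8. That threshold is an assumption here, since the abstract does not state which cutoff the study applied, and the function name is hypothetical.

```python
# Conventional genome-wide significance threshold.
# Assumed for illustration; the abstract does not state the cutoff used.
GENOME_WIDE_ALPHA = 5e-8

# Discovery-group P-values as quoted in the abstract.
discovery_p = {
    "rs763035": 8.0e-11,
    "HLA-DRB1*03:01": 1.0e-8,
    "HLA-DQB1*02:01": 1.3e-8,
    "HLA-DQA1*05:01": 1.4e-8,
}


def genome_wide_significant(p_values, alpha=GENOME_WIDE_ALPHA):
    """Return, for each variant, whether its P-value is below the threshold."""
    return {name: p < alpha for name, p in p_values.items()}


significant = genome_wide_significant(discovery_p)
for name, passed in significant.items():
    print(f"{name}: p={discovery_p[name]:.1e} significant={passed}")
```

Under this assumed cutoff, all four discovery-group associations fall below the threshold. The replication-group P-values are evaluated against a much smaller number of tests, so the same cutoff would not normally apply to them.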

    Accurate HLA type inference using a weighted similarity graph

    Background: The human leukocyte antigen system (HLA) contains many highly variable genes. HLA genes play an important role in the human immune system, and HLA gene matching is crucial for the success of human organ transplantations. Numerous studies have demonstrated that variation in HLA genes is associated with many autoimmune, inflammatory and infectious diseases. However, typing HLA genes by serology or PCR is time consuming and expensive, which limits large-scale studies involving HLA genes. Since it is much easier and cheaper to obtain single nucleotide polymorphism (SNP) genotype data, accurate computational algorithms to infer HLA gene types from SNP genotype data are needed. To infer HLA types from SNP genotypes, the first step is to infer SNP haplotypes from genotypes. However, for the same SNP genotype data set, the haplotype configurations inferred by different methods are usually inconsistent, and it is often difficult to decide which one is true. Results: In this paper, we design an accurate HLA gene type inference algorithm by utilizing SNP genotype data from pedigrees, known HLA gene types of some individuals and the relationship between inferred SNP haplotypes and HLA gene types. Given a set of haplotypes inferred from the genotypes of a population consisting of many pedigrees, the algorithm first constructs a weighted similarity graph based on a new haplotype similarity measure and derives constraint edges from known HLA gene types. Based on the principle that different HLA gene alleles should have different background haplotypes, the algorithm searches for an optimal labeling of all the haplotypes with unknown HLA gene types such that the total weight among the same HLA gene types is maximized. To deal with ambiguous haplotype solutions, we use a genetic algorithm to select haplotype configurations that tend to maximize the same optimization criterion. Our experiments on a previously typed subset of the HapMap data show that the algorithm is highly accurate, achieving an accuracy of 96% for gene HLA-A, 95% for HLA-B, 97% for HLA-C, 84% for HLA-DRB1, 98% for HLA-DQA1 and 97% for HLA-DQB1 in a leave-one-out test. Conclusions: Our algorithm can infer HLA gene types from neighboring SNP genotype data accurately. Compared with a recent approach on the same input data, our algorithm achieved a higher accuracy. The code of our algorithm is available to the public for free upon request to the corresponding authors.
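
The labeling idea in this abstract can be sketched in a much simplified form. The example below is not the authors' algorithm: it assumes a plain position-matching similarity in place of the paper's new similarity measure, uses a greedy per-haplotype choice in place of the global optimization and genetic algorithm, and runs on hypothetical toy haplotypes and allele names.

```python
from collections import defaultdict


def similarity(h1, h2):
    """Fraction of SNP positions at which two equal-length haplotypes agree.

    A stand-in measure; the paper's own similarity measure is not given here.
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return sum(a == b for a, b in zip(h1, h2)) / len(h1)


def label_haplotypes(known, unknown):
    """Greedily give each unlabeled haplotype the allele whose labeled
    haplotypes carry the largest total similarity weight to it."""
    labels = {}
    for hap in unknown:
        weight_by_allele = defaultdict(float)
        for ref_hap, allele in known.items():
            weight_by_allele[allele] += similarity(hap, ref_hap)
        labels[hap] = max(weight_by_allele, key=weight_by_allele.get)
    return labels


# Hypothetical toy data: SNP haplotypes as 0/1 strings with known HLA-A types.
known = {"00110": "A*01", "00111": "A*01", "11000": "A*02"}
unknown = ["00100", "11001"]

predicted = label_haplotypes(known, unknown)
print(predicted)
```

Summing weights per allele favors alleles with many labeled haplotypes, which is one reason a global optimization with constraint edges, as described in the abstract, is preferable to this greedy shortcut.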

    Genome-wide inference of ancestral recombination graphs

    The complex correlation structure of a collection of orthologous DNA sequences is uniquely captured by the "ancestral recombination graph" (ARG), a complete record of coalescence and recombination events in the history of the sample. However, existing methods for ARG inference are computationally intensive, highly approximate, or limited to small numbers of sequences, and, as a consequence, explicit ARG inference is rarely used in applied population genomics. Here, we introduce a new algorithm for ARG inference that is efficient enough to apply to dozens of complete mammalian genomes. The key idea of our approach is to sample an ARG of n chromosomes conditional on an ARG of n-1 chromosomes, an operation we call "threading." Using techniques based on hidden Markov models, we can perform this threading operation exactly, up to the assumptions of the sequentially Markov coalescent and a discretization of time. An extension allows for threading of subtrees instead of individual sequences. Repeated application of these threading operations results in highly efficient Markov chain Monte Carlo samplers for ARGs. We have implemented these methods in a computer program called ARGweaver. Experiments with simulated data indicate that ARGweaver converges rapidly to the true posterior distribution and is effective in recovering various features of the ARG for dozens of sequences generated under realistic parameters for human populations. In applications of ARGweaver to 54 human genome sequences from Complete Genomics, we find clear signatures of natural selection, including regions of unusually ancient ancestry associated with balancing selection and reductions in allele age in sites under directional selection. Preliminary results also indicate that our methods can be used to gain insight into complex features of human population structure, even with a noninformative prior distribution. Comment: 88 pages, 7 main figures, 22 supplementary figures. This version contains a substantially expanded genomic data analysis.

    Prediction of HLA class II alleles using SNPs in an African population

    BACKGROUND: Despite the importance of the human leukocyte antigen (HLA) gene locus in research and clinical practice, direct HLA typing is laborious and expensive. Furthermore, the analysis requires specialized software and expertise which are unavailable in most developing country settings. Recently, in silico methods have been developed for predicting HLA alleles using single nucleotide polymorphisms (SNPs). However, the utility of these methods in African populations has not been systematically evaluated. METHODOLOGY/PRINCIPAL FINDINGS: In the present study, we investigate prediction of HLA class II (HLA-DRB1 and HLA-DQB1) alleles using SNPs in the Wolaita population, southern Ethiopia. The subjects comprised 297 Ethiopians with genome-wide SNP data, of whom 188 had also been HLA typed and were used for training and testing the model. The 109 subjects with SNP data alone were used for empirical prediction using the multi-allelic gene prediction method. We evaluated accuracy of the prediction, agreement between predicted and HLA typed alleles, and discriminative ability of the prediction probability supplied by the model. We found that the model predicted intermediate (two-digit) resolution for HLA-DRB1 and HLA-DQB1 alleles at accuracy levels of 96% and 87%, respectively. All measures of performance showed high accuracy and reliability for prediction. The distribution of the majority of HLA alleles in the study was similar to that previously reported for the Oromo and Amhara ethnic groups from Ethiopia. CONCLUSIONS/SIGNIFICANCE: We demonstrate that HLA class II alleles can be predicted from SNP genotype data with a high level of accuracy at intermediate (two-digit) resolution in an African population. This finding offers new opportunities for HLA studies of disease epidemiology and population genetics in developing countries.
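
To make the notion of accuracy at two-digit resolution concrete, the sketch below scores predicted alleles against typed alleles after truncating each to its first field. The allele lists are hypothetical toy data, not results from the study, and per-allele matching is an assumed simplification of how such studies score concordance.

```python
def to_two_digit(allele):
    """Truncate an allele such as 'DRB1*03:01' to its first field, 'DRB1*03'."""
    return allele.split(":")[0]


def accuracy(predicted, typed, resolution=to_two_digit):
    """Share of positions where predicted and typed alleles agree
    after both are reduced to the chosen resolution."""
    if len(predicted) != len(typed):
        raise ValueError("predicted and typed must have the same length")
    matches = sum(resolution(p) == resolution(t) for p, t in zip(predicted, typed))
    return matches / len(typed)


# Hypothetical toy data for illustration only.
typed = ["DRB1*03:01", "DRB1*15:03", "DRB1*13:02", "DRB1*11:01"]
predicted = ["DRB1*03:02", "DRB1*15:03", "DRB1*13:02", "DRB1*07:01"]

two_digit_acc = accuracy(predicted, typed)
four_digit_acc = accuracy(predicted, typed, resolution=lambda a: a)
print(two_digit_acc, four_digit_acc)
```

A prediction that misses only the second field (03:02 versus 03:01) counts as correct at two-digit resolution and as an error at four-digit resolution, which is why the two figures differ.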

    Imputing Amino Acid Polymorphisms in Human Leukocyte Antigens

    DNA sequence variation within human leukocyte antigen (HLA) genes mediates susceptibility to a wide range of human diseases. The complex genetic structure of the major histocompatibility complex (MHC) makes it difficult, however, to collect genotyping data in large cohorts. Long-range linkage disequilibrium between HLA loci and SNP markers across the MHC region offers an alternative approach through imputation to interrogate HLA variation in existing GWAS data sets. Here we describe a computational strategy, SNP2HLA, to impute classical alleles and amino acid polymorphisms at class I (HLA-A, -B, -C) and class II (-DPA1, -DPB1, -DQA1, -DQB1, and -DRB1) loci. To characterize performance of SNP2HLA, we constructed two European ancestry reference panels, one based on data collected in HapMap-CEPH pedigrees (90 individuals) and another based on data collected by the Type 1 Diabetes Genetics Consortium (T1DGC, 5,225 individuals). We imputed HLA alleles in an independent data set from the British 1958 Birth Cohort (N = 918) with gold standard four-digit HLA types and SNPs genotyped using the Affymetrix GeneChip 500 K and Illumina Immunochip microarrays. We demonstrate that the sample size of the reference panel, rather than SNP density of the genotyping platform, is critical to achieve high imputation accuracy. Using the larger T1DGC reference panel, the average accuracy at four-digit resolution is 94.7% using the low-density Affymetrix GeneChip 500 K, and 96.7% using the high-density Illumina Immunochip. For amino acid polymorphisms within HLA genes, we achieve 98.6% and 99.3% accuracy using the Affymetrix GeneChip 500 K and Illumina Immunochip, respectively. Finally, we demonstrate how imputation and association testing at amino acid resolution can facilitate fine-mapping of primary MHC association signals, giving a specific example from type 1 diabetes.

    Identification of rheumatoid arthritis biomarkers based on single nucleotide polymorphisms and haplotype blocks: A systematic review and meta-analysis

    The genetics of autoimmune diseases is a growing field in which biomarker research is progressing rapidly. The exact cause of rheumatoid arthritis (RA) is unknown, but it is thought to have both genetic and environmental bases. Genetic biomarkers could change the management of RA by allowing not only the detection of susceptible individuals, but also early diagnosis, evaluation of disease severity, selection of therapy, and monitoring of response to therapy. This review is concerned not only with the genetic biomarkers of RA but also with the methods of identifying them. Many of the known genetic biomarkers of RA were identified in populations of European and Asian ancestries. The study of additional human populations may yield novel results. Most researchers working on RA biomarker identification use single nucleotide polymorphism (SNP) approaches to express the significance of their results. However, haplotype block methods are expected to play a complementary role in the future of that field.

    The French Canadian founder population: lessons and insights for genetic epidemiological research

    The French Canadian founder population has a demographic history that makes it an important population for epidemiology and genetics. This work aims to explain which features can be used to improve the design and analysis of genetic epidemiological studies in the Quebec population. First, we take advantage of the extended genealogical records available for French Canadians to estimate relatedness from those records and compare it to genetic kinship. Kinship based on identical-by-descent sharing correlates well with genealogical kinship, further demonstrating the usefulness of genomic identical-by-descent detection for capturing complex relatedness involving inbreeding. These findings can guide the interpretation of results in other populations without genealogical data. Second, to exploit the full potential of these well-preserved, exhaustive and detailed French Canadian genealogical data, we present the GENLIB R package, developed to study large genealogies. We also investigate identical-by-descent sharing with simulations and highlight the fact that regional population structure can facilitate the identification of notable founders that could have introduced disease mutations, opening the door to prevention and screening of founder-related diseases. Third, knowing that French Canadians have accumulated segments of homozygous genotypes as a result of inbreeding due to distant ancestors, we estimate inbreeding in French Canadian individuals and investigate its impact on multiple health traits. We show how inbreeding depression influences complex traits such as height and blood-related traits. These results are a few examples of what we can learn from the French Canadian population. They will help to gain insight into the characteristics of other populations and will support genetic epidemiological research within the French Canadian population.
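
Genealogical kinship and inbreeding of the kind discussed in this abstract can be computed from a pedigree with the standard recursive definition of the kinship coefficient. The sketch below is a minimal illustration on a hypothetical five-person pedigree. It does not use or reproduce the GENLIB package, and every name in it is an assumption made for the example.

```python
from functools import lru_cache

# Hypothetical toy pedigree: individual -> (father, mother).
# Founders map to (None, None). Parents are listed before their children.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),  # child of full siblings
}
ORDER = {name: index for index, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Kinship coefficient: probability that alleles drawn at random from
    i and j at one locus are identical by descent."""
    if i is None or j is None:
        return 0.0  # unknown parents of founders contribute no sharing
    if i == j:
        father, mother = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(father, mother))
    # Recurse through the parents of the individual listed later.
    if ORDER[i] > ORDER[j]:
        i, j = j, i
    father, mother = PEDIGREE[j]
    return 0.5 * (kinship(i, father) + kinship(i, mother))


def inbreeding(i):
    """Inbreeding coefficient of i, equal to the kinship of its parents."""
    father, mother = PEDIGREE[i]
    return kinship(father, mother)


print(kinship("C", "D"), inbreeding("E"))
```

In this toy pedigree the full siblings C and D have kinship 0.25, so their child E has an inbreeding coefficient of 0.25. Deep genealogies need an iterative or sparse formulation, because plain recursion becomes costly as the number of generations grows.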