
    Discerning the ancestry of European Americans in genetic association studies

    European Americans are often treated as a homogeneous group, but in fact form a structured population due to historical immigration of diverse source populations. Discerning the ancestry of European Americans genotyped in association studies is important in order to prevent false-positive or false-negative associations due to population stratification and to identify genetic variants whose contribution to disease risk differs across European ancestries. Here, we investigate empirical patterns of population structure in European Americans, analyzing 4,198 samples from four genome-wide association studies to show that components roughly corresponding to northwest European, southeast European, and Ashkenazi Jewish ancestry are the main sources of European American population structure. Building on this insight, we constructed a panel of 300 validated markers that are highly informative for distinguishing these ancestries. We demonstrate that this panel of markers can be used to correct for stratification in association studies that do not generate dense genotype data.

    Detection of regulator genes and eQTLs in gene networks

    Genetic differences between individuals associated with quantitative phenotypic traits, including disease states, are usually found in non-coding genomic regions. These genetic variants are often also associated with differences in expression levels of nearby genes (they are "expression quantitative trait loci", or eQTLs for short) and presumably play a gene regulatory role, affecting the status of molecular networks of interacting genes, proteins and metabolites. Computational systems biology approaches to reconstruct causal gene networks from large-scale omics data have therefore become essential to understand the structure of networks controlled by eQTLs together with other regulatory genes, and to generate detailed hypotheses about the molecular mechanisms that lead from genotype to phenotype. Here we review the main analytical methods and software to identify eQTLs and their associated genes, to reconstruct co-expression networks and modules, to reconstruct causal Bayesian gene and module networks, and to validate predicted networks in silico. Comment: minor revision with typos corrected; review article; 24 pages, 2 figures

    New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk.

    Levels of circulating glucose are tightly regulated. To identify new loci influencing glycemic traits, we performed meta-analyses of 21 genome-wide association studies informative for fasting glucose, fasting insulin and indices of beta-cell function (HOMA-B) and insulin resistance (HOMA-IR) in up to 46,186 nondiabetic participants. Follow-up of 25 loci in up to 76,558 additional subjects identified 16 loci associated with fasting glucose and HOMA-B and two loci associated with fasting insulin and HOMA-IR. These include nine loci newly associated with fasting glucose (in or near ADCY5, MADD, ADRA2A, CRY2, FADS1, GLIS3, SLC2A2, PROX1 and C2CD4B) and one influencing fasting insulin and HOMA-IR (near IGF1). We also demonstrated association of ADCY5, PROX1, GCK, GCKR and DGKB-TMEM195 with type 2 diabetes. Within these loci, likely biological candidate genes influence signal transduction, cell proliferation, development, glucose-sensing and circadian regulation. Our results demonstrate that genetic studies of glycemic traits can identify type 2 diabetes risk loci, as well as loci containing gene variants that are associated with a modest elevation in glucose levels but are not associated with overt diabetes.

    Hundreds of variants clustered in genomic loci and biological pathways affect human height

    Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits, but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P < 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented among variants that alter amino-acid structure of proteins and expression levels of nearby genes. Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation). 
    Although additional approaches are needed to dissect the genetic architecture of polygenic human traits fully, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.

    Analysis of genome-wide structure, diversity and fine mapping of Mendelian traits in traditional and village chickens

    Extensive phenotypic variation is a common feature among village chickens found throughout much of the developing world, and in traditional chicken breeds that have been artificially selected for traits such as plumage variety. We present here an assessment of traditional and village chicken populations, for fine mapping of Mendelian traits using genome-wide single-nucleotide polymorphism (SNP) genotyping while providing information on their genetic structure and diversity. Bayesian clustering analysis reveals two main genetic backgrounds in traditional breeds, Kenyan, Ethiopian and Chilean village chickens. Analysis of linkage disequilibrium (LD) reveals useful LD (r²⩾0.3) in both traditional and village chickens at pairwise marker distances of ∼10 Kb, while haplotype block analysis indicates a median block size of 11–12 Kb. Association mapping yielded refined mapping intervals for duplex comb (Gga 2:38.55–38.89 Mb) and rose comb (Gga 7:18.41–22.09 Mb) phenotypes in traditional breeds. Combined mapping information from traditional breeds and Chilean village chickens allows the oocyan phenotype to be fine mapped to two small regions (Gga 1:67.25–67.28 Mb, Gga 1:67.28–67.32 Mb) totalling ∼75 Kb. Mapping the unmapped earlobe pigmentation phenotype supports previous findings that the trait is sex-linked and polygenic. A critical assessment of the number of SNPs required to map simple traits indicates that between 90 and 110K SNPs are required for full genome-wide analysis of haplotype block structure/ancestry, and for association mapping in both traditional and village chickens. Our results demonstrate the importance and uniqueness of phenotypic diversity and genetic structure of traditional chicken breeds for fine-scale mapping of Mendelian traits in the species, with village chicken populations providing further opportunities to enhance mapping resolutions.

    Scanning and filling : ultra-dense SNP genotyping combining genotyping-by-sequencing, SNP array and whole-genome resequencing data

    Genotyping-by-sequencing (GBS) represents a highly cost-effective high-throughput genotyping approach. By nature, however, GBS is subject to generating sizeable amounts of missing data and these will need to be imputed for many downstream analyses. The extent to which such missing data can be tolerated in calling SNPs has not been explored widely. In this work, we first explore the use of imputation to fill in missing genotypes in GBS datasets. Importantly, we use whole genome resequencing data to assess the accuracy of the imputed data. Using a panel of 301 soybean accessions, we show that over 62,000 SNPs could be called when tolerating up to 80% missing data, a five-fold increase over the number called when tolerating up to 20% missing data. At all levels of missing data examined (between 20% and 80%), the resulting SNP datasets were of uniformly high accuracy (96–98%). We then used imputation to combine complementary SNP datasets derived from GBS and a SNP array (SoySNP50K). We thus produced an enhanced dataset of >100,000 SNPs and the genotypes at the previously untyped loci were again imputed with a high level of accuracy (95%). Of the >4,000,000 SNPs identified through resequencing 23 accessions (among the 301 used in the GBS analysis), 1.4 million tag SNPs were used as a reference to impute this large set of SNPs on the entire panel of 301 accessions. These previously untyped loci could be imputed with around 90% accuracy. Finally, we used the 100K SNP dataset (GBS + SoySNP50K) to perform a GWAS on seed oil content within this collection of soybean accessions. Both the number of significant marker-trait associations and the peak significance levels were improved considerably using this enhanced catalog of SNPs relative to a smaller catalog resulting from GBS alone at 20% missing data.
    Our results demonstrate that imputation can be used to fill in both missing genotypes and untyped loci with very high accuracy and that this leads to more powerful genetic analyses.

    Polymorphisms in genes of interleukin 12 and its receptors and their association with protection against severe malarial anaemia in children in western Kenya

    Background: Malarial anaemia is characterized by destruction of malaria-infected red blood cells and suppression of erythropoiesis. Interleukin 12 (IL12) significantly boosts erythropoietic responses in murine models of malarial anaemia and decreased IL12 levels are associated with severe malarial anaemia (SMA) in children. Based on the biological relevance of IL12 in malarial anaemia, the relationship between genetic polymorphisms of IL12 and its receptors and SMA was examined. Methods: Fifty-five tagging single nucleotide polymorphisms covering genes encoding two IL12 subunits, IL12A and IL12B, and its receptors, IL12RB1 and IL12RB2, were examined in a cohort of 913 children residing in Asembo Bay region of western Kenya. Results: An increasing copy number of minor variant (C) in IL12A (rs2243140) was significantly associated with a decreased risk of SMA (P = 0.006; risk ratio, 0.52 for carrying one copy of allele C and 0.28 for two copies). Possessing two copies of a rare variant (C) in IL12RB1 (rs429774) also appeared to be strongly protective against SMA (P = 0.00005; risk ratio, 0.18). In addition, homozygosity for another rare allele (T) in IL12A (rs22431348) was associated with reduced risk of severe anaemia (SA) (P = 0.004; risk ratio, 0.69) and of severe anaemia with any parasitaemia (SAP) (P = 0.004; risk ratio, 0.66). In contrast, AG genotype for another variant in IL12RB1 (rs383483) was associated with susceptibility to high-density parasitaemia (HDP) (P = 0.003; risk ratio, 1.21). Conclusions: This study has shown strong associations between polymorphisms in the genes of IL12A and IL12RB1 and protection from SMA in Kenyan children, suggesting that human genetic variants of IL12 related genes may significantly contribute to the development of anaemia in malaria patients.

    Genetic risk factors for cerebrovascular disease in children with sickle cell disease: design of a case-control association study and genomewide screen

    BACKGROUND: The phenotypic heterogeneity of sickle cell disease is likely the result of multiple genetic factors and their interaction with the sickle mutation. High transcranial doppler (TCD) velocities define a subgroup of children with sickle cell disease who are at increased risk for developing ischemic stroke. The genetic factors leading to the development of a high TCD velocity (i.e. cerebrovascular disease) and ultimately to stroke are not well characterized. METHODS: We have designed a case-control association study to elucidate the role of genetic polymorphisms as risk factors for cerebrovascular disease as measured by a high TCD velocity in children with sickle cell disease. The study will consist of two parts: a candidate gene study and a genomewide screen and will be performed in 230 cases and 400 controls. Cases will include 130 patients (TCD ≥ 200 cm/s) randomized in the Stroke Prevention Trial in Sickle Cell Anemia (STOP) study as well as 100 other patients found to have high TCD in STOP II screening. Four hundred sickle cell disease patients with a normal TCD velocity (TCD < 170 cm/s) will be controls. The candidate gene study will involve the analysis of 28 genetic polymorphisms in 20 candidate genes. The polymorphisms include mutations in coagulation factor genes (Factor V, Prothrombin, Fibrinogen, Factor VII, Factor XIII, PAI-1), platelet activation/function (GpIIb/IIIa, GpIb IX-V, GpIa/IIa), vascular reactivity (ACE), endothelial cell function (MTHFR, thrombomodulin, VCAM-1, E-Selectin, L-Selectin, P-Selectin, ICAM-1), inflammation (TNFα), lipid metabolism (Apo A1, Apo E), and cell adhesion (VCAM-1, E-Selectin, L-Selectin, P-Selectin, ICAM-1). We will perform a genomewide screen of validated single nucleotide polymorphisms (SNPs) in pooled DNA samples from 230 cases and 400 controls to study the possible association of additional polymorphisms with the high-risk phenotype. 
    High-throughput SNP genotyping will be performed through MALDI-TOF technology using Sequenom's MassARRAY™ system. DISCUSSION: It is expected that this study will yield important information on genetic risk factors for the cerebrovascular disease phenotype in sickle cell disease by clarifying the role of candidate genes in the development of high TCD. The genomewide screen for a large number of SNPs may uncover the association of novel polymorphisms with cerebrovascular disease and stroke in sickle cell disease.