    Genotyping for genetic association studies : methods and applications

    In this thesis, two single nucleotide polymorphism (SNP) genotyping techniques were set up at the Finnish Genome Center, pooled genotyping was evaluated as a screening method for large-scale association studies, and finally these approaches were used to identify genetic factors predisposing to two distinct complex diseases in large epidemiological cohorts, taking environmental factors into account. The first genotyping platform was based on a traditional but improved restriction fragment length polymorphism (RFLP) method utilizing 384-well microtiter plates, multiplexing, small reaction volumes (5 µl), and automated genotype calling. We participated in the development of the second genotyping method, based on single nucleotide primer extension (SNuPe™ by Amersham Biosciences), by carrying out the alpha and beta tests for the chemistry and the allele-calling software. Both techniques proved to be accurate, reliable, and suitable for projects with thousands of samples and tens of markers. Pooled genotyping (genotyping of pooled instead of individual DNA samples) was evaluated with Sequenom's MassArray MALDI-TOF, in addition to the SNuPe™ and PCR-RFLP techniques. We used MassArray mainly as a point of comparison, because it is known to be well suited for pooled genotyping. All three methods were shown to be accurate, the standard deviations between measurements being 0.017 for MassArray, 0.022 for PCR-RFLP, and 0.026 for SNuPe™. The largest source of error in pooled genotyping was shown to be the volumetric error, i.e., the preparation of the pools. We also demonstrated that it would have been possible to narrow down the genetic locus underlying congenital chloride diarrhea (CLD), an autosomal recessive disorder, by using the pooling technique instead of genotyping individual samples. 
Although the pooling approach seems well suited for traditional case-control studies, it is difficult to apply if any kind of stratification based on environmental factors is needed. We therefore chose to continue with individual genotyping in the subsequent association studies. Samples from two separate large epidemiological cohorts were genotyped with the PCR-RFLP and SNuPe™ techniques. The first of these association studies concerned various pregnancy complications among 100,000 consecutive pregnancies in Finland, of which we genotyped 2292 patients and controls, in addition to a population sample of 644 blood donors, for 7 polymorphisms in potentially thrombotic genes. This thesis includes the analysis of a sub-study of pregnancy-related venous thromboses. We showed that the factor V Leiden polymorphism, but not the other tested polymorphisms, had a fairly large impact on pregnancy-related venous thrombosis (odds ratio 11.6; 95% CI 3.6-33.6), and that the risk increased multiplicatively when combined with other risk factors such as obesity or advanced age. Owing to our study design, we were also able to estimate the risks at the population level. The second epidemiological cohort was the Helsinki Birth Cohort of men and women who were born during 1924-1933 in Helsinki. The aim was to identify genetic factors that might modify the well-known link between small birth size and adult metabolic diseases, such as type 2 diabetes and impaired glucose tolerance. Among ~500 individuals with detailed birth measurements and a current metabolic profile, we found that an insertion/deletion polymorphism of the angiotensin-converting enzyme (ACE) gene was associated with the duration of gestation, and with weight and length at birth. Interestingly, the ACE insertion allele was also associated with higher indices of insulin secretion (p=0.0004) in adult life, but only among individuals who were born small (those in the lowest third of birth weight). 
Likewise, low birth weight was associated with higher indices of insulin secretion (p=0.003), but only among carriers of the ACE insertion allele. An association with birth measurements was also found for a common haplotype of the glucocorticoid receptor (GR) gene. Furthermore, the association between short length at birth and adult impaired glucose tolerance was confined to carriers of this haplotype (p=0.007). These associations exemplify the interaction between environmental factors and genotype, which, possibly due to altered gene expression, predisposes to complex metabolic diseases. Indeed, we showed that the common GR gene haplotype was associated with reduced mRNA expression in the thymus of three individuals (p=0.0002).
The single nucleotide polymorphism (SNP) is the most common source of inter-individual variation in the human genome. SNPs are therefore the most useful genetic markers for identifying genes that predispose to complex diseases, and the development of SNP genotyping methods has been important in recent years. Identifying susceptibility genes for complex diseases has nevertheless proved difficult, because instead of a single gene, several genes as well as gene-environment interactions may be involved. In this thesis, SNP genotyping methods based on restriction enzyme digestion and on minisequencing were set up at the Finnish Genome Center to make large-scale genotyping possible. The performance of these methods was evaluated in an application in which allele frequencies are determined from pooled DNA samples instead of individual DNA samples. Finally, the methods were used to identify genetic factors predisposing to complex diseases in two separate studies that also took the effect of environmental factors into account. Determining allele frequencies from pooled DNA samples can considerably reduce genotyping costs and working time. 
We found that the tested genotyping methods performed accurately in this application. As an example disease we used congenital chloride diarrhea, whose causative gene defect is known. We determined the allele frequencies of SNPs located in the region of the disease gene in groups of cases, controls, and carriers. The largest, and statistically highly significant, allele frequency difference between cases and controls was observed for the SNPs surrounding the disease gene. The genomic region associated with chloride diarrhea could thus have been narrowed down to a very small interval by using pooled instead of individual samples. Although the application works well in a traditional case-control design, it is difficult to apply to studies that also take the effect of environmental factors into account. We therefore used individual DNA samples in the subsequent association studies. We used the method based on restriction enzyme digestion in a sub-study that aimed to determine the magnitude of the venous thrombosis risk caused by both genetic and acquired predisposing factors during pregnancy and the puerperium. The study is part of a larger project examining 100,000 consecutive pregnancies and the pregnancy complications that occurred during them. Of the genetic factors studied, only FV Leiden proved to be a very strong risk factor for venous thrombosis during pregnancy. FV Leiden together with other predisposing factors, such as advanced age and overweight, raised the risk of pregnancy-related venous thrombosis to an alarmingly high level. Thanks to the study design, we were also able to calculate population-level risks, which have often been missing from earlier studies. Small birth size has been shown to predispose to adult diseases such as type 2 diabetes and cardiovascular disease. The phenomenon has been explained by, among other things, fetal programming: unfavorable conditions already during fetal life create a risk of disease in adulthood. On the other hand, the link between small birth size and adult disease has also been explained by shared genetic factors. 
Our aim was to study the mechanisms of this phenomenon by searching for gene variants that modify the susceptibility that small birth size confers to, among others, type 2 diabetes and impaired glucose tolerance. The genotyping methods used were based on PCR and minisequencing. The clinical examinations included 500 women and men who were born in Helsinki in 1924-1933 and whose detailed birth measurements were known. We showed that the insertion/deletion polymorphism of the ACE gene is associated with the duration of gestation and with birth weight and length. In addition, the insertion allele was associated with, among other things, increased insulin secretion in adulthood, but only in individuals who were born small. A common GR haplotype was also associated with birth measurements, and additionally with elevated fasting cortisol and glucose concentrations and impaired glucose tolerance, again only in those born small. We showed that this haplotype is associated with reduced expression of the GR gene. These findings demonstrate an interesting interaction between genetic and environmental factors.
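
    The pooled genotyping evaluation described in this abstract rests on estimating an allele frequency from the two allele signals measured in a DNA pool. The sketch below illustrates one commonly used estimator, in which the second allele's signal is scaled by a correction factor k taken from heterozygous individuals (whose true allele ratio is 1:1). The abstract does not say which estimator the thesis used, so the formula, the function name, and all signal values here are assumptions for illustration only.

```python
from statistics import mean, stdev


def pooled_allele_frequency(signal_a: float, signal_b: float, k: float = 1.0) -> float:
    """Estimate the frequency of allele A in a pool from two allele signals.

    k corrects for unequal signal strength of the two alleles; it is the
    mean A/B signal ratio observed in known heterozygotes (assumed approach).
    """
    return signal_a / (signal_a + k * signal_b)


# Hypothetical heterozygote signals (A, B), used only to derive k.
heterozygote_signals = [(1050.0, 1000.0), (980.0, 950.0), (1100.0, 1020.0)]
k = mean(a / b for a, b in heterozygote_signals)

# Hypothetical replicate measurements (A, B) of one DNA pool.
pool_replicates = [(620.0, 410.0), (600.0, 420.0), (640.0, 400.0)]
estimates = [pooled_allele_frequency(a, b, k) for a, b in pool_replicates]

# The spread between replicate estimates is the kind of quantity the
# abstract reports as "standard deviation between measurements".
print(f"k = {k:.3f}")
print(f"mean frequency = {mean(estimates):.3f}, SD = {stdev(estimates):.3f}")
```

    With k derived this way, a heterozygote's own signals map to a frequency of 0.5, which is a quick check on the correction. Comparing the mean estimates of a case pool and a control pool then gives the allele frequency difference used for screening.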

    Risk of pneumococcal bacteremia in Kenyan children with glucose-6-phosphate dehydrogenase deficiency

    Background Glucose-6-phosphate dehydrogenase (G6PD) deficiency is the most common enzyme deficiency state in humans. The clinical phenotype is variable and includes asymptomatic individuals, episodic hemolysis induced by oxidative stress, and chronic hemolysis. G6PD deficiency is common in malaria-endemic regions, an observation hypothesized to be due to balancing selection at the G6PD locus driven by malaria. G6PD deficiency increases risk of severe malarial anemia, a key determinant of invasive bacterial disease in malaria-endemic settings. The pneumococcus is a leading cause of invasive bacterial infection and death in African children. The effect of G6PD deficiency on risk of pneumococcal disease is undefined. We hypothesized that G6PD deficiency increases pneumococcal disease risk and that this effect is dependent upon malaria. Methods We performed a genetic case-control study of pneumococcal bacteremia in Kenyan children stratified across a period of falling malaria transmission between 1998 and 2010. Results Four hundred twenty-nine Kenyan children with pneumococcal bacteremia and 2677 control children were included in the study. Among control children, G6PD deficiency, secondary to the rs1050828 G>A mutation, was common, with 11.2% (n = 301 of 2677) being hemi- or homozygotes and 33.3% (n = 442 of 1329) of girls being heterozygotes. We found that G6PD deficiency increased the risk of pneumococcal bacteremia, but only during a period of high malaria transmission (P = 0.014; OR 2.33, 95% CI 1.19-4.57). We estimate that the population attributable fraction of G6PD deficiency on risk of pneumococcal bacteremia in areas under high malaria transmission is 0.129. Conclusions Our data demonstrate that G6PD deficiency increases risk of pneumococcal bacteremia in a manner dependent on malaria. At the population level, the impact of G6PD deficiency on invasive pneumococcal disease risk in malaria-endemic regions is substantial. 
Our study highlights the infection-associated morbidity and mortality conferred by G6PD deficiency in malaria-endemic settings and adds to our understanding of the potential indirect health benefits of improved malaria control. Peer reviewed.
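
    The population attributable fraction quoted in this abstract can be roughly cross-checked from the other reported figures. The sketch below uses Levin's formula with the odds ratio standing in for the relative risk. This is an assumption, since the abstract does not state how the authors computed their estimate, and the function name is hypothetical.

```python
def levin_paf(exposure_prevalence: float, odds_ratio: float) -> float:
    """Levin's population attributable fraction.

    PAF = p(RR - 1) / (1 + p(RR - 1)), with the odds ratio used as an
    approximation of the relative risk (an assumption of this sketch).
    """
    excess = exposure_prevalence * (odds_ratio - 1.0)
    return excess / (1.0 + excess)


# Figures quoted in the abstract.
prevalence = 301 / 2677  # hemi- or homozygous G6PD-deficient control children
odds_ratio = 2.33        # pneumococcal bacteremia, high malaria transmission

paf = levin_paf(prevalence, odds_ratio)
print(f"Approximate PAF: {paf:.3f}")
```

    Under these assumptions the result is close to 0.13, in line with the reported 0.129. Any small difference would be expected if the authors used a different formula or unrounded inputs.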

    Dietary fatty acid intake in childhood and the risk of islet autoimmunity and type 1 diabetes : the DIPP birth cohort study

    Publisher Copyright: © 2022, The Author(s). Purpose The aim was to study the associations between dietary intake of fatty acids in childhood and the risk of islet autoimmunity and type 1 diabetes (T1D). Methods The prospective Finnish Type 1 Diabetes Prediction and Prevention (DIPP) Study included children with genetic susceptibility to T1D born between 1996 and 2004. Participants were followed up every 3 to 12 months up to 6 years for diet, islet autoantibodies, and T1D. Dietary intake of several fatty acids at the age of 3 months to 6 years was assessed 1-8 times per participant with a 3-day food record. Joint models adjusted for energy intake, sex, HLA genotype and familial diabetes were used to investigate the associations of longitudinal intake of fatty acids and the development of islet autoimmunity and T1D. Results During the 6-year follow-up, 247 of 5626 children (4.4%) developed islet autoimmunity and 94 of 5674 children (1.7%) developed T1D. Higher intake of monounsaturated fatty acids (HR 0.63; 95% CI 0.47, 0.82), arachidonic acid (0.69; 0.50, 0.94), total n-3 fatty acids (0.64; 0.48, 0.84), and long-chain n-3 fatty acids (0.14; 0.04, 0.43) was associated with a decreased risk of islet autoimmunity with and without energy adjustment. Higher intake of total fat (0.73; 0.53, 0.98) and saturated fatty acids (0.55; 0.33, 0.90) was associated with a decreased risk of T1D only when energy adjusted. Conclusion Intake of several fatty acids was associated with a decreased risk of islet autoimmunity or T1D among high-risk children. Our findings support the idea that dietary factors, including n-3 fatty acids, may play a role in the disease process of T1D. Peer reviewed.

    Variants in the Mannose-binding Lectin Gene MBL2 do not Associate With Sepsis Susceptibility or Survival in a Large European Cohort

    We use a large cohort of immune competent adults to analyze the influence of MBL2 genetic variants on sepsis susceptibility and survival. We find no significant associations with the 4 main functional single nucleotide polymorphisms in MBL2, or any combination of genotype

    The correlation between reading and mathematics ability at age twelve has a substantial genetic component

    Dissecting how genetic and environmental influences impact on learning is helpful for maximizing numeracy and literacy. Here we show, using twin and genome-wide analysis, that there is a substantial genetic component to children’s ability in reading and mathematics, and estimate that around one half of the observed correlation in these traits is due to shared genetic effects (so-called Generalist Genes). Thus, our results highlight the potential role of the learning environment in contributing to differences in a child’s cognitive abilities at age twelve.

    Risk of nontyphoidal Salmonella bacteraemia in African children is modified by STAT4

    Nontyphoidal Salmonella (NTS) is a major cause of bacteraemia in Africa. The disease typically affects HIV-infected individuals and young children, causing substantial morbidity and mortality. Here we present a genome-wide association study (180 cases, 2677 controls) and replication analysis of NTS bacteraemia in Kenyan and Malawian children. We identify a locus in STAT4, rs13390936, associated with NTS bacteraemia. rs13390936 is a context-specific expression quantitative trait locus for STAT4 RNA expression, and individuals carrying the NTS-risk genotype demonstrate decreased interferon-gamma (IFN gamma) production in stimulated natural killer cells, and decreased circulating IFN gamma concentrations during acute NTS bacteraemia. The NTS-risk allele at rs13390936 is associated with protection against a range of autoimmune diseases. These data implicate interleukin-12-dependent IFN gamma-mediated immunity as a determinant of invasive NTS disease in African children, and highlight the shared genetic architecture of infectious and autoimmune disease. Peer reviewed.

    Genome-wide association study identifies a variant in HDAC9 associated with large vessel ischemic stroke

    Genetic factors have been implicated in stroke risk, but few replicated associations have been reported. We conducted a genome-wide association study (GWAS) in ischemic stroke and its subtypes in 3,548 cases and 5,972 controls, all of European ancestry. Replication of potential signals was performed in 5,859 cases and 6,281 controls. We replicated reported associations between variants close to PITX2 and ZFHX3 with cardioembolic stroke, and a 9p21 locus with large vessel stroke. We identified a novel association with large vessel stroke for a SNP within the histone deacetylase 9 (HDAC9) gene on chromosome 7p21.1, including additional replication in a further 735 cases and 28,583 controls (rs11984041, combined P = 1.87 × 10⁻¹¹, OR = 1.42, 95% CI 1.28-1.57). All four loci exhibit evidence for heterogeneity of effect across the stroke subtypes, with some, and possibly all, affecting risk for only one subtype. This suggests differing genetic architectures for different stroke subtypes.
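
    The odds ratio, confidence interval, and P value reported for rs11984041 can be checked against each other with a simple Wald approximation on the log odds ratio scale. The sketch below is only a consistency check from the rounded figures in the abstract. It is not the authors' meta-analysis method, and the function name is hypothetical.

```python
import math


def wald_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                    z_crit: float = 1.959964) -> tuple[float, float, float]:
    """Recover the standard error, z statistic and two-sided P value
    implied by an odds ratio and its 95% confidence interval."""
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided P value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Figures quoted in the abstract for rs11984041.
se, z, p = wald_from_or_ci(1.42, 1.28, 1.57)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, P = {p:.2e}")
```

    The implied P value comes out on the order of 10⁻¹¹, the same order of magnitude as the reported combined P. An exact match is not expected because the inputs are rounded to two decimals.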

