174 research outputs found

    Exome sequencing analysis of rare autosomal recessive disorders

    Since the Human Genome Project was completed in 2003, extraordinary progress has been made in genomics with the development of new sequencing technologies and the widespread introduction of next-generation sequencing (NGS). The application of NGS initiated a new era in genomics by massively increasing the number and diversity of sequenced genomes at lower cost. Human molecular genetics has greatly benefited from the use of NGS-based strategies to identify human disease genes. In this thesis, I applied genetic techniques to investigate the molecular basis of autosomal recessively inherited disorders of unknown etiology. A range of disease phenotypes, including oligodontia and fetal akinesia/multiple pterygium syndrome (FA/MPS), were investigated in patient cohorts that included many cases with parental consanguinity. An autozygosity linkage analysis-based approach combined with Sanger sequencing of candidate genes resulted in the identification of germline RYR1 mutations in FA/MPS. Subsequently, using exome sequencing techniques, the molecular basis of FA/MPS was further elucidated by the identification of germline mutations in RYR1, NEB, CHRNG, CHRNA1 and TPM2. The application of NGS in genetically heterogeneous disorders such as FA/MPS can enable better and less expensive molecular diagnostic services aimed at specific mutation spectra, though more extensive sequencing can lead to the identification of larger numbers of variants of uncertain significance.

    Large-scale data analysis to identify novel disease phenotypes and genes

    Diseases can occur due to genetic changes that alter the normal function of genes. These alterations may be either inherited or acquired somatically during lifetime. The aims of this thesis were to efficiently analyze large quantities of epidemiological and molecular data, and to characterize new susceptibility conditions and genetic causes of human diseases. First, the genetic basis of right atrial isomerism (RAI) was studied in a Finnish family with five affected siblings and healthy parents. RAI is a heterotaxy syndrome in which disturbed development of the left-right axis results in anomalies of the heart and other asymmetrical organs. Linkage analysis and a candidate-gene approach followed by sequencing revealed two truncating mutations in GDF1 segregating with RAI in an autosomal recessive manner. This finding, supported by the similar phenotype of laterality defects in Gdf1 knockout mice, provides evidence that RAI can be recessively inherited with GDF1 as the causative gene. Second, six patients with severe intellectual disability (ID) of unknown etiology were studied by genetic mapping and whole-genome sequencing (WGS) analysis. Autosomal recessive inheritance of severe ID was confirmed by extensive genealogy, and by linkage analysis showing high statistical significance for a homozygous region at 3p22.1-3p21.1. Three genes, TKT, P4HTM and USP4, with potentially protein-damaging sequence changes were identified within the locus. The variants were rare and present only in heterozygous form in population-matched controls. This study facilitates clinical and molecular diagnosis of similar patients and further research on the role of these genes in the development of severe ID. Third, we performed WGS and transcriptome profiling of 38 uterine leiomyomas and corresponding myometrium from 30 women. 
Uterine leiomyomas are benign tumors that affect approximately three-quarters of women and may cause severe symptoms including abdominal pain and excessive uterine bleeding. Abundant complex chromosomal rearrangement events resembling the recently described chromothripsis phenomenon were detected. The events had created leiomyoma-specific driver changes, and occurred sequentially in some tumors. Four molecular pathways driven by alterations of MED12, FH, HMGA2/HMGA1 or COL4A5/COL4A6 were identified. The clonal origin of multiple separate tumors was also proven. The molecular genetic characterization of uterine leiomyomas will hopefully lead to a better understanding of tumor growth and to personalized treatment of patients. Fourth, a systematic search for familial aggregation of all tumor types was performed to identify new susceptibility phenotypes. We used the entire population-based data set of the Finnish Cancer Registry and clustered 878,593 patients according to family name at birth, municipality of birth and tumor type. The rate of familial occurrence was estimated with a cluster score method. Alongside known cancer predisposition syndromes, Kaposi sarcoma (KS), whose genetic background is largely unknown, was highlighted. Population records verified that the majority of the clustered KS patients were true relatives, providing further evidence that the clustering works well in estimating familiality. This study enabled the identification of families suitable for subsequent research on the genetic basis of novel tumor predisposition phenotypes.
Genetic changes that alter the normal function of cells contribute to the development of diseases. These changes may be inherited or acquired during lifetime. The aims of this thesis were (1) to efficiently analyze large quantities of epidemiological and molecular genetic data and (2) to identify new hereditary diseases and the genetic changes underlying diseases. 
In the first study, the hereditary background of right isomerism was investigated in a family with five affected children. The phenotype of isomerism includes developmental defects in organs that are asymmetric with respect to the left and right sides, especially the heart. Linkage analysis was performed on the family's DNA samples, and candidate genes were sought, with the help of the literature, in chromosomal regions inherited recessively with the disease. Two loss-of-function mutations in the GDF1 gene were found in the affected children, one inherited from the mother and the other from the father. A similar phenotype had previously been seen in mice from whose genome Gdf1 had been removed, supporting the causality of recessively inherited GDF1 mutations in right isomerism. The results make it possible to test the GDF1 gene when right isomerism is suspected and to provide genetic counseling when needed. In the second study, the genetic cause of disease in six children with severe intellectual disability was investigated by linkage analysis and whole-genome sequencing. Genealogy and linkage analysis showed that the children's parents were distantly related to each other and identified the disease-causing homozygous chromosomal region. Within this region, sequence changes potentially harmful to gene function were identified in three genes (TKT, P4HTM and USP4). None of the healthy controls studied carried the same gene defects in both gene copies, as the intellectually disabled children in the study did. The results enable the diagnosis of other similar patients and the study of the role of these genes in severe intellectual disability. In the third study, whole-genome sequencing and a method measuring gene expression were used to identify genetic changes affecting the origin and growth of benign uterine tumors (leiomyomas). The study material comprised 38 leiomyomas and the corresponding normal tissue samples from 30 patients. 
Complex chromosomal rearrangements resembling the chromothripsis phenomenon previously associated with malignant cancers were observed in the leiomyomas. The complex rearrangements caused genetic changes typical of leiomyomas, and the tumors could be divided into four groups on the basis of gene expression. Some separate tumors that had arisen in the same uterus were shown to be clonal. Understanding the mechanisms by which leiomyomas arise will hopefully open up possibilities for developing targeted drug therapies in the future. In the fourth study, population-based data from the Finnish Cancer Registry were used with the aim of identifying new cancers occurring in families. In total, 878,593 cancer cases were clustered by surname at birth, municipality of birth and cancer type, and familial occurrence was assessed according to the degree of clustering. In addition to known hereditary cancer types, Kaposi sarcoma, among others, showed a high degree of clustering. Genealogical research confirmed that a large proportion of the clustered Kaposi sarcoma patients were relatives, lending further credibility to the clustering approach and enabling follow-up studies on the families' cancer susceptibility.
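The registry clustering step described above (grouping patients by family name at birth, municipality of birth and tumor type) can be sketched roughly as follows. This is a minimal illustration only: the field names, the `min_size` threshold and the sample records are hypothetical, and the thesis's actual cluster score method is not reproduced here, only the grouping that would precede it.

```python
from collections import Counter


def cluster_sizes(patients):
    """Count patients sharing birth surname, birth municipality and tumor type."""
    keys = (
        (p["birth_surname"], p["birth_municipality"], p["tumor_type"])
        for p in patients
    )
    return Counter(keys)


def candidate_clusters(patients, min_size=2):
    """Return groups with at least min_size patients, largest first.

    min_size is an illustrative cut-off, not a value taken from the abstract.
    """
    sizes = cluster_sizes(patients)
    return [(key, n) for key, n in sizes.most_common() if n >= min_size]


# Hypothetical records, invented for illustration only.
patients = [
    {"birth_surname": "Virtanen", "birth_municipality": "Kuopio", "tumor_type": "Kaposi sarcoma"},
    {"birth_surname": "Virtanen", "birth_municipality": "Kuopio", "tumor_type": "Kaposi sarcoma"},
    {"birth_surname": "Virtanen", "birth_municipality": "Turku", "tumor_type": "Kaposi sarcoma"},
    {"birth_surname": "Korhonen", "birth_municipality": "Kuopio", "tumor_type": "Leiomyoma"},
]

for key, n in candidate_clusters(patients):
    print(key, n)
```

Groups flagged this way would still need checking against population records, as the abstract describes, because a shared surname and birthplace do not by themselves establish kinship.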

    Genetics and tumor genomics in familial colorectal cancer

    Colorectal cancer (CRC) is one of the most common cancers in the Western world, and hereditary factors play a role in about 30% of cases. Although several genetic factors that predispose families to CRC are known, in many families affected with CRC the underlying genetics remain elusive. The work described in this thesis aimed to identify novel genetic factors that lead to an increased risk for CRC in these families. Several approaches were applied, including both germline genetic analysis and the study of genomic aberrations in colorectal carcinomas. Linkage analysis did not provide evidence for a novel high-risk factor, but provided supportive evidence for a previously identified region on 3q. Enrichment of common low-risk variants was observed in a cohort of familial CRC patients but not in early-onset solitary patients (without a family history of CRC). Profiling of genomic aberrations in colorectal carcinomas showed distinct profiles for different hereditary CRC syndromes: MUTYH-associated carcinomas showed high frequencies of copy-neutral LOH (cnLOH). Mismatch repair proficient familial carcinomas appeared to resemble the genomic profile of sporadic CRC, but with a remarkably increased frequency of 20q gain and genome-wide cnLOH.

    Forward vs. reverse genetics: a bovine perspective based on visible and hidden phenotypes of inherited disorders

    In modern cattle production, reproduction has shown a negative trend for decades while productivity and performance have improved. Although considered genetically complex, part of these fecundity, fertility, and rearing success issues are caused by Mendelian monogenic disorders. Traditionally, such disorders are investigated opportunistically based on their sporadic occurrence and through subsequent targeted analysis of affected individuals. This approach is called the forward genetic approach (FGA). Modern genomic technologies, such as single nucleotide polymorphism (SNP) array genotyping and whole-genome sequencing (WGS), allow for straightforward locus mapping and the identification of candidate causal variants in affected individuals or families. Nevertheless, a major drawback is the arbitrary sampling and limited availability of well-phenotyped individuals for research, especially for mostly invisible defects affecting fecundity, early embryonic death, and abortions. Therefore, the reverse genetic approach (RGA) is applied to screen for underlying recessive lethal or sub-lethal variants. This approach requires the availability of massive population-wide genomic data. By applying a haplotype screen for a significant deviation from Hardy-Weinberg equilibrium, genomic regions potentially harboring candidate causal variants are identified. The subsequent generation of WGS data from haplotype carriers allows mining for pathogenic variants potentially causing a reduction in homozygosity. In the first part of my thesis, I present 18 successful examples, 1 inconclusive example, and 1 example addressing co-dominant effects of a known disorder. These FGA analyses include heritable skin (n=7), bone (n=7), neuromuscular (n=1), eye (n=2), as well as syndromic disorders (n=3) in various European cattle breeds. 
Missense and frameshift variants in the IL17RA, DSP, and FA2H genes were described in three recessive genodermatoses: immunodeficiency with psoriasis-like skin alterations, syndromic ichthyosis, and ichthyosis congenita, respectively. Hypohidrotic ectodermal dysplasia was described as an X-linked disorder associated with a gross deletion in the EDA gene. In dominant genodermatoses, a missense variant in COL5A2 was shown to lead to classical Ehlers-Danlos syndrome, an in-frame deletion in KRT5 was shown to cause epidermolysis bullosa simplex, and the results of a study of an individual case of juvenile angiomatosis remained inconclusive. A recessive disorder described as hemifacial microsomia was associated with a missense variant in LAMB1. Chondrodysplasia in a single family was shown to be caused by a de novo mutation in the bull leading to a stop-loss in the FGFR3 gene. De novo mutations (missense and large deletions) in the COL2A1 and COL1A1 genes were associated with achondrogenesis type II (bulldog calf syndrome) and osteogenesis imperfecta type II, respectively. Another mutation that we found to affect bone morphology was a trisomy of chromosome 29 leading to proportional dwarfism with facial dysplasia. Congenital neuromuscular channelopathy was for the first time associated with a missense variant in KCNG1. Furthermore, a de novo missense variant in ADAMTSL4 and a recessive missense variant in CNGB3 were shown to cause congenital cataract and achromatopsia, respectively. Additionally, cases of pulmonary hypoplasia and anasarca syndrome were analyzed and shown to be caused by trisomy 20 in two unrelated calves and by a recessively inherited missense variant in ADAMTS3. Moreover, the fatal syndromic disorder skeletal-cardio-enteric dysplasia was shown to be caused by a de novo missense variant in MAP2K2. 
Finally, I investigated the effects on blood cholesterol and triglyceride levels in heterozygous carriers of the previously described APOB-related cholesterol deficiency. In the second part of my thesis, I present the outcome of the RGA in four main Swiss populations, which was validated with the SWISScow custom array. In the Brown Swiss dairy population, 72 haplotype regions showed significant depletions in homozygosity. Four of these haplotypes (BH6, BH14, BH24, and BH34) were associated with missense and nonsense variants in different genes (MARS2, MRPL55, CPT1C, and ACSL5, respectively). In the Original Braunvieh population, eight haplotype regions were identified. Candidate causal variants included a missense variant in the TUBGCP5 gene associated with haplotype OH2, and a splice site frameshift variant in the LIG3 gene associated with haplotype OH4. In the Holstein population, 24 haplotype regions were identified with a significant reduction of homozygosity. Subsequently, four novel candidate variants were proposed: a nonsense variant in KIR2DS1 for haplotype HH13, in-frame deletions in NOTCH3 for haplotype HH21 and in RIOX1 for haplotype HH25, and a missense variant in PCDH15 for haplotype HH35. In the Simmental population, eleven haplotype regions were detected. The haplotype SH5 was associated with a frameshift variant in the DIS3 gene, and the haplotypes SH8 and SH9 with missense variants in the CYP2B6 and NUBPL genes, respectively. For the Brown Swiss, Original Braunvieh, and Holstein breeds, association studies were carried out on traits describing fertility, birth, growth, and survival. Most of the described haplotypes showed additive effects. Regardless of the approach, all the described candidate causal variants can be used as a tool for precision diagnostics and represent a step towards personalized medicine in cattle. 
Furthermore, these variants can be easily genotyped and allow for targeted breeding to reduce the number of risk matings, which would lead to fewer affected animals and a significant improvement in animal health and welfare.
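The haplotype screen for depletion of homozygosity described above can be illustrated with a short sketch. It assumes random mating, so that the expected number of homozygous animals is N times the squared haplotype frequency, and it uses a Poisson lower-tail probability as a simple significance measure. Both the function name and the choice of a Poisson tail are assumptions made for illustration; the abstract does not state which test the thesis used.

```python
from math import exp, factorial


def homozygote_depletion(n_noncarrier, n_het, n_hom):
    """Compare observed haplotype homozygotes with the Hardy-Weinberg expectation.

    Returns (haplotype frequency, expected homozygotes, Poisson lower-tail p-value).
    The Poisson tail is an illustrative choice, not the thesis's documented test.
    """
    n = n_noncarrier + n_het + n_hom
    # Each heterozygote carries one copy of the haplotype, each homozygote two.
    freq = (n_het + 2 * n_hom) / (2 * n)
    expected = n * freq ** 2
    # P(X <= n_hom) for X ~ Poisson(expected): small values indicate depletion.
    p_value = sum(
        exp(-expected) * expected ** k / factorial(k) for k in range(n_hom + 1)
    )
    return freq, expected, p_value


# Hypothetical counts: 10,000 genotyped animals, 1,000 carriers, no homozygotes.
freq, expected, p_value = homozygote_depletion(9000, 1000, 0)
print(f"frequency={freq:.3f} expected={expected:.1f} p={p_value:.2e}")
```

With these invented counts about 25 homozygous animals would be expected, so observing none would mark the region as a candidate for harboring a recessive lethal variant. In practice the expectation is often refined using known matings between carriers, which this sketch does not attempt.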

    Computational Identification of Recessive Mutations in Cancers using High Throughput SNP-arrays

    This thesis presents a highly sensitive genome-wide search method for recessive mutations. The method is suitable for distantly related samples that are divided into phenotype positives and negatives. High-throughput genotype arrays are used to identify and compare homozygous regions between the cohorts. The method is demonstrated by comparing colorectal cancer patients against unaffected references. The objective is to find homozygous regions and alleles that are more common in cancer patients. We have designed and implemented software tools to automate the data analysis from genotypes to lists of candidate genes and their properties. The programs follow a pipeline architecture that allows their integration with other programs such as biological databases and copy number analysis tools. The integration of the tools is crucial, as the genome-wide analysis of cohort differences produces many candidate regions not related to the studied phenotype. CohortComparator is a genotype comparison tool that detects homozygous regions and compares their loci and allele constitutions between two sets of samples. The data is visualised in chromosome-specific graphs illustrating the homozygous regions and alleles of each sample. The genomic regions that may harbour recessive mutations are emphasised with different colours, and a scoring scheme is given for these regions. The detection of homozygous regions, the cohort comparisons and the result annotations all rest on assumptions, many of which have been parameterized in our programs. The effect of these parameters and the suitable scope of the methods have been evaluated. Samples genotyped at different resolutions can be balanced using genotype estimates derived from their haplotypes, so that they can be used within the same study.
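The idea of detecting homozygous regions and comparing them between two cohorts can be sketched as follows. This is a simplified illustration, not CohortComparator itself: the genotype encoding ("AA", "AB", "BB", with "NC" for a no-call), the `min_markers` parameter and both function names are assumptions, and the allele-constitution comparison and scoring scheme mentioned in the abstract are left out.

```python
def homozygous_runs(genotypes, min_markers=3):
    """Return (start, end) index pairs of runs of consecutive homozygous calls.

    end is exclusive. Heterozygous calls and no-calls ("NC") both break a run.
    """
    runs, start = [], None
    for i, call in enumerate(genotypes):
        is_hom = len(call) == 2 and call[0] == call[1]
        if is_hom and start is None:
            start = i
        elif not is_hom and start is not None:
            if i - start >= min_markers:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and len(genotypes) - start >= min_markers:
        runs.append((start, len(genotypes)))
    return runs


def run_coverage(cohort, n_markers, min_markers=3):
    """Count, per marker, how many samples have a homozygous run covering it."""
    counts = [0] * n_markers
    for genotypes in cohort:
        for start, end in homozygous_runs(genotypes, min_markers):
            for i in range(start, end):
                counts[i] += 1
    return counts


# Hypothetical genotypes over five markers on one chromosome.
cases = [["AA", "BB", "AA", "AB", "AA"], ["AA", "AA", "BB", "AA", "AB"]]
controls = [["AB", "AA", "AB", "BB", "AA"]]

case_cov = run_coverage(cases, 5)
control_cov = run_coverage(controls, 5)
print(case_cov, control_cov)
```

Markers where case coverage clearly exceeds control coverage would be the starting point for a candidate region. A real analysis would work with physical positions rather than marker indices and would tolerate occasional genotyping errors inside a run.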

    Data integration methods for the interpretation of genome-wide cancer data (Aineistojen yhdistämismenetelmiä genominlaajuisten syöpäaineistojen tulkintaan)

    The genetic alterations of cancer cells vary between individuals and during the progression of the disease. Advances in measurement techniques have enabled genome-scale profiling of mutations, transcription, and DNA methylation. These methods can be used to address the complexity of the disease but also raise an acute demand for the analysis of the high-dimensional data sets produced. An integrative and scalable computational infrastructure is advantageous in cancer research. First, a multitude of programs and analytic steps are needed when integrating various measurement types. Efficient execution and management of such projects saves time and reduces the probability of mistakes. Second, new information and methods can be utilised with the minor effort of re-executing the workflow. Third, a formal description of the program interfaces and the workflows aids collaboration, testing, and reuse of the work done. Fourth, the number of samples available is often small in comparison with the unknown variables of interest, such as possibly affected genes. The interpretation of new measurements in the context of existing information may limit the number of false positives when sensitive methods are needed. We have introduced new computational methods for data integration and for the management of large and heterogeneous data sets. The suitability of the methods has been demonstrated with four cancer studies covering a wide spectrum of data from population genetics to the details of the transcriptional regulation of proteins, such as androgen receptor and forkhead box protein A1. The repeatable workflows established for these colorectal cancer, glioblastoma, and prostate cancer studies have been used to maintain up-to-date registries of results for follow-up studies.
The genetic changes of cancer cells vary between patients and as the disease progresses. Advances in measurement methods have enabled genome-wide mapping of mutations, transcription and DNA methylation. Genome-wide methods can be used in the study of multifactorial cancers, but they have created a need for methods suited to examining high-dimensional data. Some of the challenges in cancer research can be addressed with an integrative and scalable computational infrastructure. First, integrating different kinds of measurements requires several applications and analysis steps. Automated execution and management of the whole saves time and reduces the chance of errors. Second, new information and methods can be put to use with little effort by re-running the workflow. Third, a formal description of software interfaces and workflows facilitates collaboration, testing and reuse of the work done. Fourth, the number of available samples is often small compared with the unknown variables of interest, such as possibly damaged genes. Interpreting new measurements in the context of existing knowledge may reduce the number of false positives when sensitive methods are needed. We have introduced new computational methods for integrating data and for handling large and heterogeneous data sets. We have demonstrated the usefulness of the methods by applying them in four cancer studies related to colorectal cancer, glioblastoma and prostate cancer. The topics of the studies range from population genetics to the details of the function of transcription factors such as the androgen receptor and FoxA1. The workflows built in repeatable form for these studies have, together with their results, provided an up-to-date source of information as a basis for follow-up studies.

    Mining downy mildew susceptibility genes: a diversity study in grapevine

    Several pathogens continuously threaten viticulture worldwide. Until now, the investigation of resistance loci has been the main approach to understanding the interaction between grapevine and mildew causal agents. Dominantly inherited gene-based resistance has been shown to be race-specific in some cases, to confer partial immunity and to be potentially overcome within a few years of its introgression. Recently, following research conducted on Arabidopsis, putative orthologues of genes associated with downy mildew susceptibility in that species have also been discovered in the grapevine genome. In this work, we deep-resequenced four putative susceptibility genes in 190 highly genetically diverse grapevine genotypes to discover new sources of broad-spectrum recessively inherited resistance. The scouted genes are VvDMR6-1, VvDMR6-2, VvDLO1 and VvDLO2, which are predicted to be involved in susceptibility to downy mildew. Of all identified mutations, 56% were heterozygous Single Nucleotide Polymorphisms (SNPs), while the remaining 44% were homozygous. Regarding the identified mutations with putative impact on gene function, we observed ~4% of genotypes mutated in VvDMR6-1 and ~8% mutated in VvDMR6-2, with only a handful of genotypes mutated in both genes. ~2% and ~7% of genotypes showed mutations in VvDLO1 and VvDLO2 respectively, and again a few genotypes were mutated in both genes. In particular, 80% of impacting mutations were heterozygous while 20% were homozygous. The current results will inform grapevine genetics and support genomic-assisted breeding programs for resistance to biotic stresses.

    Mining grapevine downy mildew susceptibility genes: A resource for genomics-based breeding and tailored gene editing

    Several pathogens continuously threaten viticulture worldwide. Until now, the investigation of resistance loci has been the main approach to understanding the interaction between grapevine and the mildew causal agents. Dominantly inherited gene-based resistance has been shown to be race-specific in some cases, to confer partial immunity, and to be potentially overcome within a few years of its introgression. Recently, following research conducted in Arabidopsis, putative genes associated with downy mildew susceptibility have also been discovered in the grapevine genome. In this work, we deep-sequenced four putative susceptibility genes (namely VvDMR6.1, VvDMR6.2, VvDLO1 and VvDLO2) in 190 genetically diverse grapevine genotypes to discover new sources of broad-spectrum and recessively inherited resistance. Identified Single Nucleotide Polymorphisms were screened in a bottleneck analysis from the genetic sequence to their impact on protein structure. Fifty-five genotypes showed at least one impacting mutation in one or more of the scouted genes. Haplotypes were inferred for each gene, and two of them at the VvDMR6.2 gene were found to be significantly more represented in downy mildew resistant genotypes. The current results provide a resource for grapevine and plant genetics and could support genomic-assisted breeding programs as well as tailored gene editing approaches for resistance to biotic stresses.
