557 research outputs found

    Beta Thalassemia Carriers detection empowered federated Learning

    Thalassemia is a group of inherited blood disorders that occur when the body does not produce enough hemoglobin, the protein in red blood cells that carries oxygen. Hemoglobin is found throughout the body and is needed for survival. If both parents have thalassemia, a child's chance of inheriting it increases. Genetic counselling and early diagnosis are essential for treating thalassemia and preventing it from being passed on to future generations. It can be hard for healthcare professionals to distinguish thalassemia carriers from non-carriers. The World Health Organization says there is a high death rate among people with thalassemia, so carriers must be identified quickly. High-performance liquid chromatography (HPLC), the standard test for beta-thalassemia carriers, is expensive, slow, and equipment-intensive, so a quick and cheap way to find carriers of the thalassemia gene is needed. This study presents a new way to detect beta-thalassemia carriers using federated learning (FL). FL allows data to be collected and processed on-site while following privacy rules, which makes it well suited to sensitive health data. The researchers used FL to train a model for beta-thalassemia carrier detection on complete blood count results and red blood cell indices. The model distinguished beta-thalassemia carriers from people without the disease with 92.38% accuracy. The proposed FL model outperforms other published methods in performance, reliability, and privacy. This research shows a promising, quick, accurate, and low-cost way to find thalassemia carriers and opens the door to large-scale screening. Comment: pages 17, figures

    Blood

    This book examines both the fluid and cellular components of blood. After the introductory section, the second section presents updates on various topics in hemodynamics. Chapters in this section discuss anemia, 4D flow MRI in cardiology, cardiovascular complications of robot-assisted laparoscopic pelvic surgery, altered perfusion in multiple sclerosis, and hemodynamic laminar shear stress in oxidative homeostasis. The third section focuses on thalassemia with chapters on diagnosis and screening for thalassemia, high blood pressure in beta-thalassemia, and hepatitis C infection in thalassemia patients

    Genetic variability, disease severity and therapeutic response in a cohort of Angolan Sickle Cell Disease patients

    Master's thesis, Biopharmaceutical Sciences, 2022, Universidade de Lisboa, Faculdade de Farmácia. Sickle cell disease, or sickle cell anaemia (SCA), is an inherited disease with recessive transmission, caused by the c.20A>T mutation in the HBB gene. It gives rise to a structural variant of normal adult haemoglobin called sickle haemoglobin (HbS). Under hypoxic conditions, the solubility of HbS falls to one fifth of that of adult haemoglobin. This causes the HbS molecule to polymerize, with consequent sickling of the erythrocytes, which become less flexible, more rigid, and prone to vaso-occlusion and haemolysis. The main clinical manifestations of SCA are highly heterogeneous and arise from vaso-occlusion and haemolysis events, causing anaemia, ischaemia, and multiple pain crises that require frequent hospitalizations. Individuals homozygous for HbS have the most severe phenotype and account for approximately 70% of all individuals with SCA. These patients develop severe anaemia because no adult haemoglobin is produced, owing to the lack of synthesis of normal beta-globin chains. SCA affects more than 300,000 newborns worldwide per year, particularly in sub-Saharan Africa, where the incidence is 75%. The disease is widely neglected, with a mortality rate of 50-90% in undiagnosed children under five years of age. Despite being a monogenic disease, SCA shows remarkably high clinical heterogeneity in its phenotypic expression. Several factors have been shown to modulate the clinical manifestations of this disease, namely genetic modulators such as alpha-thalassaemia and beta-globin haplotypes, which modify biological parameters such as the degree of haemolytic anaemia or fetal haemoglobin (HbF) levels. The genes ZBTB7A and ZNF410 have been proposed as possible influencers of SCA pathophysiology.
ZBTB7A is an important regulator of the switch from HbF to adult haemoglobin expression, acting as a transcriptional repressor when it binds to the promoter of the gamma-globin gene, which is responsible for HbF expression. The repressive potency of ZBTB7A can lead to the loss of 50% of HbF. Although few studies exist, ZNF410 is known to act on a single target in erythroid cells, which is responsible for silencing gamma-globin and, consequently, inhibiting HbF expression. Because ZNF410 and ZBTB7A act on the co-repressor complex that plays crucial roles in haemoglobin switching, these genes are believed to have a considerable influence on HbF synthesis and, subsequently, on sickle cell disease phenotypes. This highlights the importance of exploring and studying new polymorphisms in ZBTB7A and ZNF410 that may be positively correlated with HbF levels in sickle cell disease, with disease severity, and with the differentiation of clinical manifestations. The primary treatment for SCA is blood transfusion, and most patients with SCA are estimated to have received at least one transfusion. Four drugs are currently approved for the treatment of SCA: hydroxyurea, L-glutamine, crizanlizumab-tmca, and voxelotor. Gene-editing technologies such as CRISPR/Cas9 can permanently alter disease-causing genes through the correction, deletion, addition, and disruption of specific regions. This new treatment method may therefore provide a cure for most sickle cell patients. However, safety issues must be resolved to ensure its safe clinical use. The aim of this study is to determine and evaluate the frequency and influence of certain polymorphisms on disease severity and on the response to treatment for SCA in a population of African paediatric and adult patients followed at the sickle cell clinic of Clínica Girassol, in Luanda, Angola.
In addition, we aim to define the allele and genotype frequencies of the main genetic variants related to sickle cell disease, namely the S allele, beta-globin haplotypes, and variants of the BCL11A gene. As it is a marker of SCA, the allele and genotype frequencies of the 3.7kb alpha-thalassaemia deletion will be analysed. With the data obtained, we intend to carry out association studies between the genetic markers of sickle cell disease and the clinical severity of the disease. In this study we aim to identify genetic modulators of SCA by studying 120 Angolan sickle cell patients followed at the sickle cell clinic of Clínica Girassol in Luanda, Angola. Association studies were performed between clinical phenotypes and biochemical and haematological parameters, as well as 12 genetic variants in 7 genes related to disease severity (HBB, HBBP1, HBA, HBE, HBG1, ZBTB7A and ZNF410). Known genetic modulators of SCA (alpha-thalassaemia and beta-globin haplotypes) and putative genes modifying haematological parameters were characterized, and the differences in their distribution were assessed. In addition, we intend to relate the response to pharmacological therapy with hydroxyurea to the genetic markers studied, in a retrospective analysis. The presence of the 3.7kb alpha-thalassaemia deletion proved protective for disease severity factors such as age at diagnosis, age at onset of first symptoms, number of blood transfusions received, degree of anaemia, and occurrence of painful crises. In this study, we observed an increase in the prevalence of the deletion allele with increasing age, confirmed by a frequency of 50% heterozygous and 16.7% homozygous patients for the deletion among those over 20 years of age. In our project, the percentage of individuals receiving hydroxyurea treatment decreased in direct proportion to increasing age.
Furthermore, the largest percentage corresponded to patients without the deletion. It can therefore be inferred that older patients, who consequently carry more deletion alleles, do not require as much pharmacological treatment as individuals without the deletion, which underscores the protective effect of the 3.7kb alpha-thalassaemia deletion in sickle cell patients. The CAR/CAR haplotype, predominant in Angolan individuals, was the most prevalent in our population. It is characterized by a more severe phenotype, corroborated in our study by a decrease in the number of individuals with this haplotype with increasing age, suggesting reduced survival. Severity in CAR/CAR individuals was confirmed for age at diagnosis, need for hydroxyurea treatment, dactylitis, pain crises, and strokes. Contrary to expectations, CAR/SEN patients showed greater severity in age at onset of first symptoms, need for blood transfusions, and presentation of SCA symptoms. Regarding the polymorphisms, the frequency of mutations in ZNF410 was too low to draw conclusions from statistical analysis. For the ZBTB7A polymorphisms, individuals with the mutation showed a later age at onset of first symptoms, as well as fewer pain crises, less anaemia, and haematological advantages, namely increased concentrations of HbF and haemoglobin A2. This study makes a relevant contribution to the genetic knowledge of the Angolan population, in which the CAR haplotype is indisputably the most common and co-inheritance of the 3.7kb alpha-thalassaemia deletion has a major impact on the clinical severity of SCA. In this project, polymorphisms found in recently studied genes were examined for the first time, and association studies were performed between these polymorphisms and the parameters that characterize the phenotype of sickle cell patients.
Significant differences were observed in several clinical parameters and in some haematological data for all the polymorphisms studied. There is still a long way to go before we fully understand a disease as complex as sickle cell disease and why it manifests differently in different patients. This heterogeneity is undoubtedly influenced by the patients' genetic inheritance, but that is not the only factor. Further studies are therefore needed to corroborate the results obtained and to confirm our hypotheses about the impact of the polymorphisms on the clinical severity of SCA. Sickle cell disease (SCD) is an inherited disease with recessive transmission caused by the mutation c.20A>T in the HBB gene. It results in a structural variant of normal adult haemoglobin (HbS). The main clinical manifestations of SCD include severe anaemia and multiple pain crises that require regular hospitalizations. Sickle cell anaemia (SCA) designates homozygosity for the HbS mutation and represents 70% of all SCD cases. SCA affects more than 300,000 newborns worldwide per year. It is a widely neglected disease, with a mortality rate of 50-90% in undiagnosed children under five years old. Despite being a monogenic disease, SCA shows remarkably high clinical heterogeneity in its phenotypic expression. Several factors modulate the clinical manifestations of SCA, namely genetic markers such as alpha-thalassaemia and beta-globin haplotypes. Moreover, ZBTB7A and ZNF410 have recently been proposed as possible influencers of SCA pathophysiology. This project aims to identify genetic modifiers of the clinical course of SCA by studying 120 Angolan SCA patients followed at Clínica Girassol, in Luanda, Angola. Association studies were performed between the patients' clinical outcomes and their haematological and biochemical parameters, as well as with 12 genetic variants related to disease severity.
Known genetic modulators of SCA and putative genetic modifiers of haematological parameters were characterized, and the differences in their distribution were assessed. The presence of the 3.7kb α-thalassaemia deletion was found to be protective for disease severity factors, including the degree of anaemia and the occurrence of painful crises. The CAR/CAR haplotype was the most prevalent in our population. It is characterized by a worse phenotype with the most severe disease outcomes. However, CAR/SEN patients had a worse prognosis in some clinical and haematological parameters. Individuals presenting variants in ZBTB7A SNPs had a milder phenotype, and these variants were strongly associated with higher HbF levels, fewer symptoms, and fewer pain events. Health and Technology Research Center; Clínica Girassol

    Thalassemia Syndrome

    Sickle cell disease associated lipid changes and their relevance towards the disease pathogenesis And Lipid Biomarkers and Embryo Quality in In Vitro Fertilization; Pregnancy Success Differentially Expressed by Body Weight

    Abstract: Sickle cell disease (SCD) is a group of genetic disorders caused by a mutation of the beta-globin gene that leads to production of pathogenic hemoglobin S (HbS). Genotypes of SCD include HbSS (sickle cell anemia), the most common and severe form, which affects about 20 to 25 million people worldwide, as well as HbSC, HbSβ+-thalassemia, and HbSβ0-thalassemia. SCD is characterized by multiorgan complications that, in turn, affect lipid composition. During hypoxia a sequence of changes takes place, including HbS polymerization, erythrocyte rigidity and stickiness, and oxidative stress. Together these changes affect lipid components such as polyunsaturated fatty acids (PUFA), which are substrates for a significant number of bioactive lipids such as the eicosanoids and some of the endocannabinoids. For example, vaso-occlusive crisis, the most common cause of SCD hospitalization, is accompanied by changes in the PUFA components of the RBC membrane, encompassing Omega-3 and Omega-6. This comprehensive review outlines the lipid changes that accompany SCD and identifies the gaps in our knowledge. It will also allow us to devise better treatment options to manage the different pathophysiology and complications of SCD. Abstract: Introduction: A common risk factor for infertility is obesity. The global rise of obesity accompanied by infertility has led to widespread adoption of assisted reproductive technologies such as in vitro fertilization (IVF) to achieve pregnancy. However, outcomes such as embryo quality vary after IVF, possibly due to disruptions in metabolism. Previous metabolomic studies investigating embryo quality were limited to characterizing broad lipid classes or a few molecular lipid species.
Here, we sought to determine specific circulating lipids and metabolites with matrix-specific effects that could serve as putative biomarkers of embryo quality, correlate with BMI, and predict clinical pregnancy subsequent to IVF. Methods: Electronic health record (EHR) data, as well as lipids and metabolites obtained from follicular fluid (FF) and platelet poor plasma (PPP), were collected from women (n = 26) undergoing IVF. Lipids and metabolites were acquired via untargeted mass spectrometry. For embryo quality and BMI, we performed multiple linear regression analysis to find correlates. For pregnancy at 6 weeks, we applied a linear discriminant analysis to select lipids and metabolites that allowed for group determination. Results: Several lipids and metabolites were selected from both matrices (FF and PPP) that either outperformed models containing only EHR data or added value to EHR models. In predicting embryo quality, glycerophospholipids obtained from PPP produced the best-fit model. Among the predictors, lysophosphatidylcholine (LPC) 22:6, phosphatidylcholine (PC) 16:1/22:6, and phosphatidylethanolamine-plasmalogen (PE-P) 16:0/22:6 were negatively correlated with 2PN, while phosphatidylethanolamine (PE) 18:0/20:3, lysophosphatidylethanolamine (LPE) 18:1, PC 14:0/16:1, and PE-P 16:0/20:5 were positively correlated with 2PN (R-squared adjusted = 0.730, RMSE = 0.329). For rLDA of pregnancy at 6 weeks, the best model was the metabolite model obtained from PPP (misclassification = 3.85%, Entropy R-squared = 0.809). In the BMI multicomponent domain model obtained from FF, LPC 18:1, PC 16:1/22:6, and malic acid were negatively associated with BMI, while fasting insulin and PC 16:0/22:4 were positively correlated with BMI values (R-squared adjusted = 0.819, RMSE = 0.127). However, the combined data model for FF gave the best prediction of BMI values.
In this model, PE-P 16:0/22:6, aspartic acid, and fasting insulin were positively correlated with BMI values, whereas indole-3-propionic acid was negatively correlated with BMI (R-squared adjusted = 0.856, RMSE = 0.113).

    Single gene disorders

    Dissecting the molecular genetics and pathogenesis of Hereditary Dyserythropoietic Anemias

    Hereditary anemias (HAs) are a heterogeneous group of chronic disorders with a highly variable clinical picture. Within HAs, congenital dyserythropoietic anemias (CDAs) are a large group of hypo-productive anemias that result from various abnormalities during the late stages of erythropoiesis. Among them, CDAI is characterized by relative reticulocytopenia and congenital anomalies. It is caused by biallelic mutations in CDAN1 or C15orf41. Differential diagnosis, classification, and patient stratification of CDAs and related HAs are often difficult, particularly between CDAI-II and enzymatic defects such as pyruvate kinase deficiency (PKD). The classical diagnostic workflow for these conditions includes different lines of investigation, in which genetic testing by next generation sequencing (NGS) has become the frontline approach. The primary aim of this study was to analyze a large cohort of HA patients (n=244) with our (t)-NGS RedPanel to establish the correct molecular diagnosis regardless of the initial clinical suspicion. Only 16.3% of patients originally suspected of having CDA (14/86) showed a matching genotype. Conversely, 64% of patients (72/86) initially suspected of having CDA were diagnosed with other HAs, mainly PKD. In agreement with this observation, analysis of the main erythroid markers demonstrated that PKD patients showed a dyserythropoietic component that may underlie the frequent misdiagnosis as CDAI-II. Beyond achieving a definitive diagnosis, knowing the genetic basis of the disease in these patients is also valuable for guiding treatment. In our cohort, we identified a new case of syndromic CDA due to a novel variant in the CAD gene, which led to a specific treatment with uridine supplementation. Finally, we described three cases of CDAI, identifying two novel variants in the DNA binding domain of C15orf41, Y94S and P20T, and another in the nuclease domain of the protein, H230P.
Functional characterization of these variants showed that H230P leads to reduced gene expression and protein levels, while Y94S and P20T do not affect C15orf41 expression. Moreover, the Y94S and H230P variants impaired erythroid differentiation in K562 cells, and the H230P mutant also exhibited an increased S-phase of the cell cycle. C15orf41 remains an uncharacterized gene encoding a protein of unknown function, so we aimed to gain new insights into its physiological role. We demonstrated that endogenous C15orf41 protein shows both nuclear and cytosolic localization, being mostly in the nucleus. Our data showed that C15orf41 is a cell-cycle regulated protein, mostly expressed during G1/S phase, and that both predicted isoforms of the protein are degraded by the ubiquitin-proteasome pathway. Finally, we demonstrated that gene expression of C15orf41 and CDAN1, the other causative gene of CDAI, is tightly correlated, suggesting a shared mechanism of regulation between the two genes. Overall, these studies point to the relevance of genetic testing for achieving a correct and definitive diagnosis of CDAs and related HAs, for treating these conditions, and for elucidating the pathogenic mechanisms underlying such rare disorders.

    Multi-omics approaches to sickle cell disease heterogeneity

    Sickle cell disease is a monogenic disorder caused by a point mutation in the beta-globin gene. The complications related to the disease are characterized by a broad spectrum of distinct genetic, epigenetic, transcriptional, and metabolomic states. Integrative approaches based on high-throughput technologies are crucial to understanding the mechanisms of these complications and uncovering therapeutic interventions. In this thesis, I integrate various omics datasets and apply statistical methods to derive new hypotheses and analyze data. In the first two studies, I combine genome-wide association study results for fetal hemoglobin (HbF) and dehydrated dense red blood cells (DRBC) with gene expression, chromatin interaction, disease-relevant databases, and expert-curated drug targets. This integrative approach revealed three novel loci on chromosome 10 (BICC1), chromosome 19 (KLF1) and chromosome 22 (CECR2) as key modulators of HbF. For DRBC, four drug targets (BCL6, LRRC32, KNCJ14, and LETM1) were identified as potential severity modifiers. In the third study, I integrated metabolomics with genomics using Mendelian randomization to establish a potential causal relationship between L-glutamine and painful crises. Additionally, we identified 66 biomarkers for 6 SCD-related complications and estimated glomerular filtration rate (eGFR). Finally, the last study applied a clustering framework to metabolites, which I then combined with genotypes. I found specific metabolomic changes that highlight families of metabolites involved in renal and liver dysfunction and confirm the role of a class of fatty acids in red blood cell sickling. This work highlights the importance of multi-omics approaches for uncovering new biology and studying human diseases.

    Cardiovascular magnetic resonance of the right ventricle

    Introduction: Whilst most of the attention in cardiovascular disease has been devoted to the left ventricle, the right ventricle has been somewhat neglected. In recent decades there has been a renewal of interest in the right ventricle, driven in part by advances in cardiovascular imaging. Methods: Cardiovascular magnetic resonance is arguably the best imaging modality for the study of the right ventricle. In this research thesis, cardiovascular magnetic resonance was used as the primary research tool to assess the right ventricle in different conditions and settings. Results: This thesis encompasses five studies that have been published as peer-reviewed articles. The results of these studies were the following: 1) Right ventricular dilatation and dysfunction were found in a group of patients with Marfan syndrome, further supporting the existence of a Marfan-related cardiomyopathy; 2) In thalassaemia major, right ventricular volumes and ejection fraction differed from healthy controls, and new reference ranges based on patients without iron overload were derived; 3) Myocardial iron loading in thalassaemia major was associated with progressive right ventricular dysfunction; 4) Right ventricular dysfunction due to myocardial siderosis was reversible with effective iron chelation therapy; and 5) In advanced heart failure, right ventricular function was a predictor of response and outcomes in patients undergoing cardiac resynchronization therapy. Conclusion: The right ventricle is an essential component of the circulatory system and should be more widely evaluated in patients with cardiopulmonary disease.

    Enhancing genetic discoveries with population-specific reference panels

    Around ten years ago, an approach known as the genome-wide association study (GWAS) ushered in a new era in genetics research, shedding light on the previously largely unknown genetic components underlying complex traits and diseases. Statistical inferential methods were key ingredients for success, allowing researchers to incorporate external data in their studies and so maximize information at no additional experimental cost. Technology has continued to improve: while initially <1 million points of the DNA (genetic variants) were assessable in a person, nowadays the entire genome (3 billion points) can be characterized with next-generation sequencing machines. The cost of sequencing is still impractical for GWASs, because several thousand individuals are needed to ensure reproducible findings. With statistical methods, however, full genomes can be inferred if a reduced number of genetic variants is characterized in the study's volunteers and a reference set of independent genomes is available. An international effort, the 1000 Genomes Project, has generated public reference sets by sequencing ~2,500 representatives of the world's populations. In this thesis, we evaluated the benefits of a population-specific reference set for Sardinians by sequencing 2,120 volunteers and subsequently incorporating the set in GWASs. We show how the accuracy of inferred genomes improved compared to using the 1000 Genomes set, and we identified novel genetic components for several complex traits that could not have been discovered otherwise. Similar efforts are ongoing in other populations, including the Dutch, and we discuss their design and results in this thesis.