557 research outputs found
Beta Thalassemia Carriers detection empowered federated Learning
Thalassemia is a group of inherited blood disorders in which the body does not
make enough hemoglobin, the protein in red blood cells that carries oxygen.
Hemoglobin is found throughout the body and is essential for survival. If both
parents carry thalassemia, a child's chance of inheriting it increases. Genetic
counselling and early diagnosis are essential for treating thalassemia and
preventing it from being passed on to future generations. Healthcare
professionals may find it hard to distinguish thalassemia carriers from
non-carriers. Current blood tests for beta-thalassemia carriers are expensive,
slow, and require extensive screening equipment. According to the World Health
Organization, the death rate among people with thalassemia is high, so carriers
must be identified early to allow prompt action. High-performance liquid
chromatography (HPLC), the standard test method, is limited by cost, time, and
equipment needs, so a quick and inexpensive way to identify carriers of the
thalassemia gene is needed. This study presents a new approach to detecting
beta-thalassemia carriers using federated learning (FL). FL allows data to be
collected and processed on-site in compliance with privacy rules, which makes
it well suited to sensitive health data. The researchers used FL to train a
beta-thalassemia carrier model on complete blood count results and red blood
cell indices. The model distinguished beta-thalassemia carriers from people
without the disease with 92.38% accuracy. The proposed FL model outperforms
other published methods in performance, reliability, and privacy. This research
demonstrates a promising, quick, accurate, and low-cost way to find thalassemia
carriers and opens the door to large-scale screening.
Comment: pages 17, figures
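The abstract above does not say which aggregation rule the federated model uses. As an illustration only, the sketch below shows the sample-weighted parameter averaging used in the common FedAvg scheme: each site trains locally and shares only its parameter vector and sample count, never the patient records. The function name `federated_average` and the example numbers are hypothetical and not taken from the study.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Sample-weighted average of client parameter vectors (FedAvg-style).

    Assumption: every client trains the same model, so all parameter
    vectors have the same length. Raw patient data never leaves a client;
    only these vectors and the local sample counts are shared.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its local sample count.
            averaged[i] += w * n / total
    return averaged


# Hypothetical example: two hospitals with 1 and 3 local samples.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by sample count keeps a small site from pulling the shared model as strongly as a large one; other aggregation rules are possible.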
Blood
This book examines both the fluid and cellular components of blood. After the introductory section, the second section presents updates on various topics in hemodynamics. Chapters in this section discuss anemia, 4D flow MRI in cardiology, cardiovascular complications of robot-assisted laparoscopic pelvic surgery, altered perfusion in multiple sclerosis, and hemodynamic laminar shear stress in oxidative homeostasis. The third section focuses on thalassemia with chapters on diagnosis and screening for thalassemia, high blood pressure in beta-thalassemia, and hepatitis C infection in thalassemia patients
Genetic variability, disease severity and therapeutic response in a cohort of Angolan Sickle Cell Disease patients
Master's thesis, Biopharmaceutical Sciences, 2022, Universidade de Lisboa, Faculdade de Farmácia.
Sickle cell disease, or sickle cell anaemia (SCA), is an inherited disease with
recessive transmission, caused by the c.20A>T mutation in the HBB gene. It gives rise to a
structural variant of normal adult haemoglobin called sickle haemoglobin (HbS). Under
hypoxic conditions, the solubility of HbS falls to one fifth of that of adult
haemoglobin. This causes the HbS molecule to polymerize and the erythrocytes to sickle,
making them less flexible, more rigid, and prone to vaso-occlusion and haemolysis. The
main clinical manifestations of SCA are highly heterogeneous and stem from these
vaso-occlusive and haemolytic events, causing anaemia, ischaemia, and multiple pain
crises that require frequent hospitalization.
Individuals homozygous for HbS have the most severe phenotype and account for
approximately 70% of all individuals with SCA. These patients develop severe anaemia
because no adult haemoglobin is produced, owing to the lack of synthesis of normal
beta-globin chains. SCA affects more than 300,000 newborns worldwide each year,
particularly in sub-Saharan Africa, where the incidence is 75%. The disease is widely
neglected, with a mortality rate of 50-90% in undiagnosed children under five years
of age.
Despite being a monogenic disease, SCA shows remarkably high clinical heterogeneity
in its phenotypic expression. Several factors have been shown to modulate its
clinical manifestations, notably genetic modulators such as alpha-thalassaemia and
beta-globin haplotypes, which modify biological parameters such as the degree of
haemolytic anaemia or fetal haemoglobin (HbF) levels. The genes ZBTB7A and ZNF410
have been proposed as possible influencers of SCA pathophysiology.
ZBTB7A is an important regulator of the switch from HbF to adult haemoglobin
expression, acting as a transcriptional repressor when it binds to the promoter of the
gamma-globin gene, which is responsible for HbF expression. The repressive potency of
ZBTB7A can lead to a 50% loss of HbF.
Although few studies exist, ZNF410 is known to act on only a single target in
erythroid cells, which is responsible for silencing gamma-globin and consequently
inhibiting HbF expression. Since ZNF410 and ZBTB7A act on the co-repressor complex that
plays crucial roles in haemoglobin switching, these genes are believed to have a
considerable influence on HbF synthesis and, in turn, on sickle cell disease
phenotypes. This underlines the importance of exploring new polymorphisms in ZBTB7A
and ZNF410 that may be positively correlated with HbF levels in sickle cell disease,
disease severity, and the differentiation of clinical manifestations.
The primary treatment for SCA is blood transfusion, and most patients with SCA are
estimated to have received at least one transfusion. Four drugs are currently
approved for the treatment of SCA: hydroxyurea, L-glutamine, crizanlizumab-tmca, and
voxelotor.
Gene-editing technologies such as CRISPR/Cas9 can permanently alter disease-causing
genes by correcting, deleting, adding, or disrupting specific regions. This new
treatment method may therefore provide a cure for most sickle cell patients. However,
safety issues need to be resolved to ensure its safe clinical use.
The aim of this study is to determine and evaluate the frequency of certain
polymorphisms and their influence on disease severity and response to treatment in
SCA, in a population of African paediatric and adult patients followed at the sickle
cell clinic of Clínica Girassol, in Luanda, Angola. In addition, we intend to define
the allele and genotype frequencies of the main genetic variants related to sickle
cell disease, namely the S allele, beta-globin haplotypes, and variants of the BCL11A
gene. As it is a marker of SCA, the allele and genotype frequencies of the 3.7 kb
alpha-thalassaemia deletion will also be analysed. With the data obtained, we intend
to carry out association studies between the genetic markers of sickle cell disease
and the clinical severity of the disease.
In this study we aim to identify genetic modulators of SCA by studying 120 Angolan
sickle cell patients followed at the sickle cell clinic of Clínica Girassol in
Luanda, Angola. Association studies were performed between clinical phenotypes and
biochemical and haematological parameters, as well as 12 genetic variants in 7 genes
related to disease severity (HBB, HBBP1, HBA, HBE, HBG1, ZBTB7A, and ZNF410). Known
genetic modulators of SCA (alpha-thalassaemia and beta-globin haplotypes) and putative
modifier genes of haematological parameters were characterized, and differences in
their distribution were assessed. In addition, we intend to relate the response to
pharmacological therapy with hydroxyurea to the genetic markers studied, through a
retrospective analysis.
The presence of the 3.7 kb alpha-thalassaemia deletion proved protective for disease
severity factors such as age at diagnosis, age at first symptoms, number of blood
transfusions received, degree of anaemia, and occurrence of painful crises. In this
study we observed an increase in the prevalence of the deletion allele with
increasing age, confirmed by frequencies of 50% heterozygous and 16.7% homozygous
patients for the deletion among those older than 20 years.
In our project, the percentage of individuals receiving hydroxyurea treatment fell
in proportion to increasing age. Furthermore, the highest percentage corresponded to
patients without the deletion. It can thus be inferred that older patients, who
consequently carry more deletion alleles, do not require as much pharmacological
treatment as individuals without the deletion, highlighting the protective role of
the 3.7 kb alpha-thalassaemia deletion in sickle cell patients.
The CAR/CAR haplotype, predominant in Angolan individuals, was the most prevalent in
our population. It is characterized by a more severe phenotype, corroborated in our
study by a decrease in the number of individuals with this haplotype with increasing
age, suggesting reduced survival. Severity in CAR/CAR individuals was confirmed for
age at diagnosis, need for hydroxyurea treatment, dactylitis, pain crises, and
strokes. Contrary to expectations, CAR/SEN patients showed greater severity in age at
first symptoms, need for blood transfusions, and presentation of SCA symptoms.
Regarding the polymorphisms, the frequency of mutations in ZNF410 was too low to draw
conclusions from statistical analysis. For the ZBTB7A polymorphisms, individuals with
the mutation showed a later age at first symptoms, as well as a reduction in pain
crises and anaemia and haematological advantages, namely increased concentrations of
HbF and haemoglobin A2.
This study makes a relevant contribution to genetic knowledge of the Angolan
population, in which the CAR haplotype is indisputably the most common and
co-inheritance of the 3.7 kb alpha-thalassaemia deletion has a major impact on the
clinical severity of SCA. In this project, polymorphisms found in recently studied
genes were examined for the first time, and association studies were performed
between these polymorphisms and the parameters that characterize the phenotype of
sickle cell patients. Significant differences were observed in several clinical
parameters and in some haematological data, for all the polymorphisms studied.
There is still a long way to go before we fully understand a disease as complex as
sickle cell disease and why it manifests differently across patients. This
heterogeneity is undoubtedly influenced by patients' genetic inheritance, but that is
not the only factor. Further studies are therefore needed to corroborate the results
obtained and confirm our hypotheses about the impact of the polymorphisms on the
clinical severity of SCA.
Sickle cell disease (SCD) is an inherited disease with recessive transmission caused by the
mutation c.20A>T in the HBB gene. It results in a structural variant of normal adult
haemoglobin (HbS). The main clinical manifestations of SCD include severe anaemia and
multiple pain crises that require regular hospitalizations.
Sickle cell anaemia (SCA) designates homozygosity for the HbS mutation and represents 70%
of all SCD cases. SCA affects more than 300,000 newborns worldwide per year. This is a
widely neglected disease, accounting for a mortality rate between 50-90% in undiagnosed
children under five years old.
Despite being a monogenic disease, SCA has a remarkably high clinical heterogeneity
in its phenotypic expression. Several factors modulate the clinical manifestations of SCA,
namely genetic markers such as alpha-thalassaemia and beta-globin haplotypes. Moreover,
ZBTB7A and ZNF410 have been recently proposed as possible influencers of SCA
pathophysiology.
This project aims to identify genetic modifiers of the clinical course of SCA by studying 120
Angolan SCA patients followed at Clínica Girassol, in Luanda, Angola. Association studies
were performed between the clinical outcomes, and haematological and biochemical
parameters of patients, as well as with 12 genetic variants related to disease severity. Known
genetic modulators of SCA and putative genetic modifiers of haematological parameters were
characterized, and the differences in their distribution were assessed.
The presence of 3.7kb α-thalassaemia deletion was found to be protective in disease severity
factors, including the degree of anaemia and occurrence of painful crises. The CAR/CAR
haplotype was the most prevalent in our population. It is characterized by a worse phenotype
with the most severe disease outcomes. However, CAR/SEN patients had a worse prognosis
in some clinical and haematological parameters. Individuals carrying variants at the
ZBTB7A SNPs had a milder phenotype: the variants were strongly associated with higher
HbF levels, fewer symptoms and fewer pain events.
Health and Technology Research Center; Clínica Girassol
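The thesis above sets out to define allele and genotype frequencies for variants such as the 3.7 kb alpha-thalassaemia deletion. As a minimal sketch of that arithmetic, assuming a biallelic variant and simple genotype counts, the helper below derives both kinds of frequency. The function name and the counts in the example are hypothetical and are not the study's data.

```python
from typing import Dict


def variant_frequencies(n_wild: int, n_het: int, n_hom: int) -> Dict[str, float]:
    """Genotype and allele frequencies for a biallelic variant.

    n_wild: individuals with no copy of the variant allele
    n_het:  heterozygous individuals (one copy)
    n_hom:  homozygous individuals (two copies)
    """
    n = n_wild + n_het + n_hom
    if n <= 0:
        raise ValueError("at least one genotyped individual is required")
    # Each individual carries two alleles, so the denominator is 2n.
    variant_alleles = n_het + 2 * n_hom
    return {
        "genotype_wild": n_wild / n,
        "genotype_het": n_het / n,
        "genotype_hom": n_hom / n,
        "allele_variant": variant_alleles / (2 * n),
    }


# Hypothetical counts for illustration only.
freqs = variant_frequencies(n_wild=50, n_het=40, n_hom=10)
```

Comparing such frequencies between age groups or severity categories is the kind of input an association analysis would start from.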
Sickle cell disease associated lipid changes and their relevance towards the disease pathogenesis And Lipid Biomarkers and Embryo Quality in In Vitro Fertilization; Pregnancy Success Differentially Expressed by Body Weight
Abstract
Sickle cell disease (SCD) is a group of genetic disorders caused by a mutation of the beta-globin gene that leads to production of pathogenic hemoglobin S (HbS). Genotypes of SCD include HbSS (sickle cell anemia), the most common and severe form of SCD, which affects about 20 to 25 million people worldwide, as well as HbSC, HbSβ+-thalassemia, and HbSβ0-thalassemia. SCD is characterized by multiorgan complications that, in turn, affect lipid composition. During hypoxia, a sequence of changes takes place, including HbS polymerization, erythrocyte rigidity and stickiness, and oxidative stress. Together, these changes affect lipid components such as polyunsaturated fatty acids (PUFA), which are substrates for a significant number of bioactive lipids such as the eicosanoids and some of the endocannabinoids. For example, vaso-occlusive crisis, the most common cause of SCD hospitalization, is accompanied by changes in the PUFA components of the RBC membrane, encompassing omega-3 and omega-6. This comprehensive review outlines the lipid changes that accompany SCD and identifies the gaps in our knowledge. It will also allow us to devise better treatment options to manage the different pathophysiology and complications of SCD.
Abstract:
Introduction: A common risk factor for infertility is obesity. The global rise of obesity accompanied by infertility has led to widespread adoption of assisted reproductive technologies such as in vitro fertilization (IVF) to achieve pregnancy. However, outcomes such as embryo quality vary after IVF, possibly due to disruptions in metabolism. Previous metabolomic studies investigating embryo quality were limited to characterizing broad lipid classes or a few molecular lipid species. Here, we sought to determine specific circulating lipids and metabolites with matrix-specific effects that could serve as putative biomarkers of embryo quality, correlate with BMI, and predict clinical pregnancy following IVF.
Methods: Electronic health record (EHR) data, as well as lipids and metabolites obtained from follicular fluid (FF) and platelet poor plasma (PPP), were collected from women (n = 26) undergoing IVF. Lipids and metabolites were acquired via untargeted mass spectrometry. For embryo quality and BMI, we performed multiple linear regression analysis to find correlates. For 6 weeks pregnancy, we applied a linear discriminant analysis to select lipids and metabolites that allowed for group determination.
Results: Several lipids and metabolites were selected from both matrices (FF and PPP) that either outperformed models containing only EHR or added value to EHR models. In predicting embryo quality, glycerophospholipids obtained from PPP produced the best-fit model. Among the predictors, lysophosphatidylcholine (LPC) 22:6, phosphatidylcholine (PC) 16:1/22:6, and phosphatidylethanolamine-plasmalogen (PE-P) 16:0/22:6 were negatively correlated with 2PN, while phosphatidylethanolamine (PE) 18:0/20:3, lysophosphatidylethanolamine (LPE) 18:1, PC 14:0/16:1, and PE-P 16:0/20:5 were positively correlated with 2PN (R adjusted = 0.730, RMSE = 0.329). For rLDA of pregnancy at 6 weeks, the best model was the metabolite model obtained from PPP (misclassification = 3.85%, entropy R-squared = 0.809).
In the BMI multicomponent domain model obtained from FF, LPC 18:1, PC 16:1/22:6, and malic acid were negatively associated with BMI, while fasting insulin and PC 16:0/22:4 were positively correlated with BMI (R-squared adjusted = 0.819, RMSE = 0.127). However, the combined data model for FF gave the best prediction of BMI. In this model, PE-P 16:0/22:6, aspartic acid, and fasting insulin were positively correlated with BMI, whereas indole-3-propionic acid was negatively correlated with BMI (R-squared adjusted = 0.856, RMSE = 0.113)
Dissecting the molecular genetics and pathogenesis of Hereditary Dyserythropoietic Anemias
Hereditary anemias (HAs) comprise a heterogeneous group of chronic disorders with a highly variable clinical picture. Within HAs, congenital dyserythropoietic anemias (CDAs) are a large group of hypo-productive anemias that result from various abnormalities during the late stages of erythropoiesis. Among them, CDAI is characterized by relative reticulocytopenia and congenital anomalies. It is caused by biallelic mutations in CDAN1 and C15orf41. Differential diagnosis, classification, and patient stratification of CDAs and related HAs are often difficult, particularly between CDAI-II and enzymatic defects such as pyruvate kinase deficiency (PKD). The classical diagnostic workflow for these conditions includes different lines of investigation, in which genetic testing by next-generation sequencing (NGS) has become the frontline approach. The primary aim of this study was to analyze a large cohort of HA patients (n=244) with our (t)-NGS RedPanel to establish the correct molecular diagnosis irrespective of the initial clinical suspicion. Only 16.3% of patients originally suspected of having CDA (14/86) showed a matched genotype. Conversely, 64% of patients (72/86) initially suspected of CDA were diagnosed with other HAs, mainly PKD. In agreement with this observation, analysis of the main erythroid markers demonstrated that PKD patients showed a dyserythropoietic component that may underlie the frequent misdiagnosis with CDAI-II.
Beyond achieving a definitive diagnosis, knowing the genetic basis of these conditions is also valuable for guiding treatment. In our cohort, we identified a new case of syndromic CDA due to a novel variant in the CAD gene, which led to a specific treatment with uridine supplementation. Finally, we described three cases of CDAI, identifying two novel variants in the DNA-binding domain of C15orf41, Y94S and P20T, and another in the nuclease domain of the protein, H230P. Functional characterization of these variants showed that H230P leads to reduced gene expression and protein levels, while Y94S and P20T do not affect C15orf41 expression. Moreover, the Y94S and H230P variants impaired erythroid differentiation in K562 cells, and the H230P mutant also exhibited an increased S-phase of the cell cycle. C15orf41 is still an uncharacterized gene, encoding a protein of unknown function, so we aimed to gain new insights into its physiological role. We demonstrated that the endogenous C15orf41 protein exhibits nuclear and cytosolic localization, being mostly in the nucleus. Our data showed that C15orf41 is a cell-cycle-regulated protein, mostly expressed during G1/S phase, and that both predicted isoforms of the protein are degraded by the ubiquitin-proteasome pathway. Finally, we demonstrated that gene expression of C15orf41 and CDAN1, the other causative gene of CDAI, is tightly correlated, suggesting a shared mechanism of regulation between the two genes.
Overall, these studies pointed out the relevance of genetic testing for the achievement of a correct and definitive diagnosis of CDAs and the related HAs, for the treatment of these conditions, and for elucidating the underlying pathogenic mechanisms of such rare disorders
Multi-omics approaches to sickle cell disease heterogeneity
Sickle cell disease is a monogenic disorder caused by a point mutation in the beta-globin gene. The complications related to the disease are characterized by a broad spectrum of distinct genetic, epigenetic, transcriptional, and metabolomic states. Integrative approaches built on high-throughput technologies are crucial to understanding the mechanisms of these complications and uncovering therapeutic interventions. In this thesis, I integrate various omics datasets and apply statistical methods to derive new hypotheses and analyze data.
I combine genome-wide association studies results of fetal hemoglobin (HbF) and dehydrated dense red blood cells (DRBC) with gene expression, chromatin interaction, disease-relevant databases, and expert-curated drug targets. This integrative approach revealed three novel loci on chromosome 10 (BICC1), chromosome 19 (KLF1) and chromosome 22 (CECR2) as key modulators of HbF. For DRBC, four drug targets (BCL6, LRRC32, KNCJ14, and LETM1) were identified as potential severity modifiers.
In the third study, I used Mendelian randomization to integrate metabolomics with genomics and establish a potential causal relationship between L-glutamine and painful crisis. Additionally, we identified 66 biomarkers for 6 SCD-related complications and estimated glomerular filtration rate (eGFR). Finally, the last study applied a clustering framework to metabolites, which I then combined with genotypes. I found specific metabolomic changes highlighting families of metabolites involved in renal and liver dysfunction and confirming the role of a class of fatty acids in red blood cell sickling. This work highlights the importance of multi-omics approaches to unearth new biology and study human diseases
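The abstract does not state which Mendelian randomization estimator was used. As a hedged illustration of the basic idea, the sketch below computes the single-variant Wald ratio: the variant's effect on the outcome divided by its effect on the exposure, with a first-order standard error that ignores uncertainty in the exposure estimate. The function name and the numbers are hypothetical.

```python
from typing import Tuple


def wald_ratio(beta_exposure: float,
               beta_outcome: float,
               se_outcome: float) -> Tuple[float, float]:
    """Single-instrument Mendelian randomization estimate (Wald ratio).

    beta_exposure: effect of the genetic variant on the exposure
                   (for example, a metabolite level)
    beta_outcome:  effect of the same variant on the outcome
    se_outcome:    standard error of beta_outcome

    Returns (causal estimate, approximate standard error). The standard
    error is a first-order approximation that treats beta_exposure as
    known without error.
    """
    if beta_exposure == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_outcome / beta_exposure
    std_error = abs(se_outcome / beta_exposure)
    return estimate, std_error


# Hypothetical summary statistics for illustration only.
estimate, std_error = wald_ratio(beta_exposure=0.5, beta_outcome=-0.1, se_outcome=0.02)
```

With several independent variants, the per-variant ratios are usually combined, for example by inverse-variance weighting.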
Cardiovascular magnetic resonance of the right ventricle
Introduction:
Whilst most of the attention in cardiovascular disease has been devoted to the left ventricle, the right ventricle has been somewhat neglected. In the last decades, there has been a renewal of interest in the right ventricle, in part driven by advances in cardiovascular imaging.
Methods:
Cardiovascular magnetic resonance is arguably the best imaging modality for the study of the right ventricle. In this research thesis, cardiovascular magnetic resonance was used as the primary research tool to assess the right ventricle in different conditions and settings.
Results:
This thesis encompasses five studies that have been published as peer-reviewed articles. The results of these studies were the following: 1) Right ventricular dilatation and dysfunction were found in a group of patients with Marfan syndrome, further supporting the existence of a Marfan-related cardiomyopathy; 2) In thalassaemia major, right ventricular volumes and ejection fraction differed from healthy controls, and new reference ranges based on patients without iron overload were derived; 3) Myocardial iron loading in thalassaemia major was associated with progressive right ventricular dysfunction; 4) Right ventricular dysfunction due to myocardial siderosis was reversible with effective iron chelation therapy; and 5) In advanced heart failure, right ventricular function was a predictor of response and outcomes in patients undergoing cardiac resynchronization therapy.
Conclusion:
The right ventricle is an essential component of the circulatory system, and should be more widely evaluated in patients with cardiopulmonary disease
Enhancing genetic discoveries with population-specific reference panels
The approach known as the genome-wide association study (GWAS) ushered in a new era in genetics research around ten years ago, shedding light on the genetic components underlying complex traits and diseases, which were previously largely unknown. Statistical inference methods were key ingredients for success, allowing researchers to incorporate external data into their studies and thereby maximize information at no additional experimental cost. Technology has continued to improve: while initially fewer than 1 million points of the DNA (genetic variants) could be assessed in a person, nowadays the entire genome (3 billion points) can be characterized with next-generation sequencing machines. The cost of sequencing is still impractical for GWASs, because several thousand individuals are needed to ensure reproducible findings. With statistical methods, however, full genomes can be inferred if a reduced number of genetic variants is characterized in the study's volunteers and a reference set of independent genomes is available. An international effort, the 1000 Genomes Project, has generated public reference sets by sequencing ~2,500 representatives of the world's populations. In this thesis, we evaluated the benefits of a population-specific reference set for Sardinians by sequencing 2,120 volunteers and subsequently incorporating it into GWASs. We show how the accuracy of inferred genomes is improved compared with using the 1000 Genomes set, and we identify novel genetic components for several complex traits that could not have been discovered otherwise. Similar efforts are ongoing in other populations, including the Dutch, and we discuss their design and results in this thesis
- …