375 research outputs found
Recommended from our members
The Monogenic Architecture of Retinal and Neurological Diseases
Monogenic diseases, or single-gene disorders, are clinical manifestations that can be traced to genetic variation in a single gene that alters the biologically intended (wildtype) function of its protein (or mRNA) product. Although the causal gene and its function are well-understood in many monogenic diseases, this knowledge alone often does not fully encapsulate the extensive clinical spectrum of phenotypes seen in patients. This is due in part to the numerous types of pathogenic variants that can arise in a single gene, all of which can have distinct effects on disease expression. Understanding the relationship between the vast number of possible genotypes and corresponding disease phenotypes defines a gene’s monogenic disease architecture—an important but poorly understood concept that can yield informative mechanistic and clinical insight.
This doctoral dissertation integrates traditional sequencing approaches with in-depth characterization of patient phenotypes to elucidate the monogenic disease architecture of three etiologically distinct disorders: retinal degeneration caused by autosomal recessive variation in ABCA4 and neurodevelopmental disease entities caused by autosomal dominant variants in CERT1 and PUM1. Genetic modifiers are identified as a significant factor in the penetrance of the major disease-causing allele of ABCA4 and several other genetic inconsistencies are resolved to create a coherent genotype-phenotype model for the disease. Insight from this model is then applied to demonstrate the effect of allele differences in disease progression and evaluation of treatment efficacy in patients. A large cohort of affected individuals with CERT1 variation is assembled to (1) validate the causal role of CERT1 in disease, (2) delineate the precise mechanism of CERT protein dysfunction in sphingolipid metabolism and (3) demonstrate therapeutic efficacy of an inhibitor compound for a newly described syndrome.
Finally, the mutational spectrum of PUM1 is expanded to previously unattributed variant classes with unexpected pathophysiological consequences for patients. The findings in this dissertation not only advance the prospects of delivering personalized, precision medicine to patients but also underscore the importance of this integrated approach in reconciling knowledge gaps between observations at the molecular and organismal levels.
A Systems Genetics Approach to Drosophila melanogaster Models of Rare and Common Neurodevelopmental Disorders
Fetal Alcohol Spectrum Disorders are a group of disorders resulting from prenatal alcohol exposure, presenting with neurodevelopmental and facial abnormalities of varying severity. SSRIDDs and CdLS are rare disorders of chromatin modification, resulting in patients with a wide range of craniofacial, digit and/or neurodevelopmental abnormalities. All of these disorders have a wide range of clinical phenotypes and disease severity, yet the role of potential genetic modifiers and gene-gene or gene-environment interactions in disease pathogenesis is largely unknown and cannot be studied in humans. Insufficient numbers of patients with a single rare disorder prevent investigation of genetic factors beyond the focal disease-associated variant, while experimental study of the more common FASD using human subjects is prohibited due to ethical constraints. Drosophila melanogaster is an excellent model system for neurodevelopmental disorders, as Drosophila neurobiology is largely conserved in humans and experiments performed in Drosophila are low-cost, easily controlled, and exempt from regulation. Here, we take advantage of the Drosophila model system and identify genetic factors contributing to these neurodevelopmental disorders. Specifically, we used the Drosophila Genetic Reference Panel (DGRP) of inbred lines with full genome sequences and single cell RNA sequencing to identify genetic networks in adult Drosophila after developmental ethanol exposure and demonstrate that changes in sleep, activity, and time to sedation as a result of the developmental ethanol exposure are dependent on genetic background. We also developed a novel assay measuring time to ethanol-induced sedation of individual flies to better assess this phenotype in our research and characterized a previously unstudied long noncoding RNA critical for Drosophila fitness and stress-response. 
We then established Drosophila models for multiple SSRIDD and CdLS subtypes and determined the extent to which behavioral and transcriptomic phenotypes vary within and across these rare disorders. Finally, we used SSRIDD Drosophila models to present evidence for the role of genetic modifiers in ARID1B-associated SSRIDD and to identify candidate genetic modifiers for multiple SSRIDD subtypes. Taken together, these results show that the Drosophila model system is a powerful tool for investigating the genetic underpinnings of both rare and common neurodevelopmental disorders that cannot currently be identified using human populations.
Disadvantaged students' academic performance: analysing the zone of proximal development
The aim of the study is to investigate the practical application of Vygotsky's construct of the Zone of Proximal Development to the selection of disadvantaged students in higher education. There is a need in post-apartheid South Africa, with its legacy of inequality in educational experiences, to find accurate and fair predictors of academic performance that would act as alternatives to matriculation marks and static tests. The study relates the students' response to mediation to their academic performance and analyses the role that non-cognitive factors such as motivation, approaches to learning and learning strategies play in cognitive performance. The investigation took the form of several studies using over 400 first-year students at the Peninsula Technikon as subjects. The first study focused on the effectiveness of the mediated lessons that form part of the two dynamic tests, using a Solomon Four Group and a Two Group design. The second study compared the predictive validity of past academic achievement, conventional static tests, several non-cognitive variables and the two dynamic tests. In the third study the students' response to a period of mediation was analysed. The fourth study compared different groups of students according to the following classification: schooling, gender, language, type of course and assessment, and level of course, to see whether any of the variables would have a moderator effect. Finally, a differentiation was made between the profiles of more successful as opposed to less successful students. The weight of evidence of the study indicates that it is possible to find alternatives to matriculation marks and static tests in selecting disadvantaged students by making use of the concept of the Zone of Proximal Development. The results further showed that disadvantaged students are not a homogeneous group.
Although the matriculation marks seemed to be the best single predictor of academic performance for the total group of students, alternative predictors were identified when looking at different subgroups. Modifiability (students' response to mediation) had a moderator effect on the predictive power of various variables. For the less modifiable group of students, the matriculation marks and, to a certain extent, static tests were good predictors, while for the more modifiable group of students a dynamic test proved to be a significant predictor of academic performance. The implications of the findings for the selection and academic development of disadvantaged students are discussed.
Development and application of methodologies and infrastructures for cancer genome analysis within Personalized Medicine
Next-generation sequencing (NGS) has revolutionized biomedical sciences, especially in the area of cancer. It has nourished genomic research with extensive collections of sequenced genomes that are investigated to untangle the molecular bases of disease, as well as to identify potential targets for the design of new treatments. To exploit all this information, several initiatives have emerged worldwide, among which the Pan-Cancer project of the ICGC (International Cancer Genome Consortium) stands out. This project has jointly analyzed thousands of tumor genomes of different cancer types in order to elucidate the molecular bases of the origin and progression of cancer. To accomplish this task, new emerging technologies, including virtualization systems such as virtual machines or software containers, were used and had to be adapted to various computing centers. The porting of this system to the supercomputing infrastructure of the BSC (Barcelona Supercomputing Center) was carried out during the first phase of the thesis. In parallel, other projects promote the translation of genomic discoveries into the clinic. This is the case of MedPerCan, a national initiative to design a pilot project for the implementation of personalized medicine in oncology in Catalonia. In this context, we have centered our efforts on the methodological side, focusing on the detection and characterization of somatic variants in tumors. This step is challenging, owing to the heterogeneity of the different methods, and essential, as it lies at the basis of all downstream analyses.
Beyond the methodological part of the thesis, we turned to the biological interpretation of the results to study the evolution of chronic lymphocytic leukemia (CLL), in close collaboration with the group of Dr. Elías Campo from the Hospital Clínic/IDIBAPS. In the first study, we focused on the Richter transformation (RT), a transformation of CLL into a high-grade lymphoma that carries a very poor prognosis and has unmet clinical needs. We found that RT has greater genomic, epigenomic and transcriptomic complexity than CLL. Its genome may reflect the imprint of therapies that the patients received prior to RT, indicating the presence of cells exposed to these mutagenic treatments which later expand, giving rise to the clinical manifestation of the disease. Multiple NGS-based techniques, including whole-genome sequencing and single-cell DNA and RNA sequencing, among others, confirmed the pre-existence of cells with the RT characteristics years before their manifestation, up to the time of CLL diagnosis. The transcriptomic profile of RT is remarkably different from that of CLL. Of particular importance is the overexpression of the OXPHOS pathway, which could be exploited as a therapeutic vulnerability. Finally, in a second study, the analysis of a case of CLL in a young adult, based on whole-genome and single-cell sequencing at different times of the disease, revealed that the founder clone of CLL did not present any somatic driver mutations and was characterized by germline variants in ATM, suggesting its role in the origin of the disease and highlighting the possible contribution of germline variants or other non-genetic mechanisms in the initiation of CLL.
Analysis of whole-genome sequencing data from ICGC-PanCancer project
Cancer is one of the greatest health challenges of the 21st century and one of the deadliest diseases in the world. It is a group of different diseases which are caused by abnormal cell growth. In the human body, cell division and apoptosis are well regulated under normal circumstances so that the number of cells is in a dynamic balance. However, normal cells could transform into tumor cells because of genetic mutations. The tumorigenesis can happen in almost any cell of the human body. One of the central tools to address cancer is the profiling of cancer cell genomes and transcriptomes by next generation sequencing (NGS) and subsequent analysis by computational methods.
The Pan-Cancer Analysis of Whole Genomes (PCAWG) project is the core project of the International Cancer Genome Consortium. This project provides massive amounts of cancer biological data for analysis, including more than 2900 patients and 48 types of cancer samples. As part of this intensive effort, I have conducted a very detailed analysis of the molecular mechanisms of cancers. In particular, I conducted a comprehensive study of the relationship between genomic mutations and cancer development. This series of studies includes the exploration of cancer driver genes, analysis of telomere maintenance mechanisms and data visualization at the cohort level.
First, I explored potential cancer genes by performing statistical analysis of genomic point mutations, insertions and deletions, copy number variations and structural variations. Further, I analyzed the distribution of point mutations and structural variations in cancer genomes. Based on Knudson's two-hit hypothesis, I integrated point mutation and copy number variation information to construct a biallelic inactivation map of the cancer genome. With the biallelic inactivation information, I analyzed potential cancer drivers and applied this finding to synthetic lethality assays associated with cancer driver genes to uncover novel genetic targets that could be used to treat cancer patients with certain driver gene defects. In addition, I designed and improved the CaSINo model to score the relative mutation frequency of chromosomal sequences to screen for potential cancer driver mutations, which can be used not only in coding genes but also in non-coding regions. Moreover, I analyzed point mutations in promoters, trying to find those mutation sites that play a key role in the up-regulation of gene expression. Finally, I designed and improved a scoring method for copy number variation focality to explore the association of focal copy number variation with cancer driver genes at the cohort level.
Second, as part of the PCAWG research projects, I analyzed the mechanisms of telomere maintenance in cancer cells. After analyzing the differences between alternative lengthening of telomeres (ALT) samples and telomerase-positive samples, I designed a machine learning model based on repeat sequences, content, and mutation rate to classify an unknown cancer sample as ALT or telomerase-positive.
Finally, for the massive data of the PCAWG project, I designed and implemented two bioinformatics visualization tools. TumorPrint is software written in R and shell, which can be used to visualize genomic mutations and RNA-seq expression levels of a single gene or gene pairs, allowing users to quickly search for genes or gene pairs of interest. GenomeTornadoPlot is software written in R for visualizing focal copy number variants of a single gene or adjacent paired genes, and it can automatically calculate a copy number variation aggregation score.
Birth, Death and Diversity: Using genomes and genomics to investigate evolution of the marsupial MHC
The major histocompatibility complex (MHC) is an immune gene family involved in the vertebrate immune response. Class I and class II genes have roles in resistance to disease and show high levels of diversity. MHC genes evolve through a birth and death process with class I genes evolving faster than class II genes. Marsupials are an interesting study system as they give birth to highly altricial and immunologically naïve young. The number of reference genomes available for marsupials has increased and it is now possible to bioinformatically annotate and compare the repertoire of MHC genes and investigate functional diversity in a number of species. Koalas are an iconic Australian marsupial threatened by two pathogens, Chlamydia pecorum and koala retrovirus (KoRV) and are currently listed as ‘Endangered’, making research into their immune system imperative for conservation of the species.
This thesis investigates the birth, death and diversity of MHC genes in marsupials and provides a workflow for investigating the evolution and diversity of any gene family in any wildlife species. I was able to achieve this by: i) tracing patterns of gene gain and loss in class II MHC genes across the marsupial lineage (29 species), ii) determining the minimum sequence depth required to accurately genotype MHC genes, iii) identifying associations between variation in immune genes and disease progression using koalas and Chlamydia and iv) investigating variation in SNPs and copy number within MHC genes of koalas.
Overall, my thesis demonstrates the power of genomic technologies to investigate the birth, death, and diversity of MHC genes. By leveraging existing genomic resources and investigating sequencing and analysis methods, I was able to identify patterns of gene gain and loss, investigate the role of MHC diversity in disease resistance, and measure diversity across the entire range of koalas.
Compounds and methods for treating, detecting, and identifying compounds to treat apicomplexan parasitic diseases
Disclosed herein are novel compounds for treating apicomplexan parasite related disorders and methods for their use; cell line and non-human animal models of the dormant parasite phenotype and methods for their use in identifying new drugs to treat apicomplexan parasite related disorders; and biomarkers to identify disease due to the parasite and its response to treatment.
From Mouse Models to Patients: A Comparative Bioinformatic Analysis of HFpEF and HFrEF
Heart failure (HF) represents an immense health burden with currently no curative therapeutic strategies. Study of HF patient heterogeneity has led to the recognition of HF with preserved (HFpEF) and reduced ejection fraction (HFrEF) as distinct syndromes regarding molecular characteristics and clinical presentation. Until the recent past, HFrEF represented the focus of research, reflected in the development of a number of therapeutic strategies. However, the pathophysiological concepts applicable to HFrEF may not be necessarily applicable to HFpEF. HF induces a series of ventricular modeling processes that involve, among others, hallmarks of hypertrophy, fibrosis, inflammation, all of which can be observed to some extent in HFpEF and HFrEF. Thus, by direct comparative analysis between HFpEF and HFrEF, distinctive features can be uncovered, possibly leading to improved pathophysiological understanding and opportunities for therapeutic intervention. Moreover, recent advances in biotechnologies, animal models, and digital infrastructure have enabled large-scale collection of molecular and clinical data, making it possible to conduct a bioinformatic comparative analysis of HFpEF and HFrEF.
Here, I first evaluated the field of HF transcriptome research by revisiting published studies and data sets to provide a consensus gene expression reference. I discussed the patient clientele that was captured, revealing that HFpEF patients were not represented. Thus, I applied alternative approaches to study HFpEF. I utilized a mouse surrogate model of HFpEF and analyzed single cell transcriptomics to gain insights into the interstitial tissue remodeling. I contrasted this analysis by comparison of fibroblast activation patterns found in mouse models resembling HFrEF. The human reference was used to further demonstrate similarities between models and patients and a novel possible biomarker for HFpEF was introduced.
Mouse models only capture selected aspects of HFpEF but largely fail to imitate the complex multi-factor and multi-organ syndrome present in humans. To account for this complexity, I performed a top-down analysis in HF patients by analyzing phenome-wide comorbidity patterns. I derived clinical insights by contrasting HFpEF and HFrEF patients and their comorbidity profiles. These profiles were then used to predict associated genetic profiles, which could be also recovered in the HFpEF mouse model, providing hypotheses about the molecular links of comorbidity profiles.
My work provided novel insights into HFpEF and HFrEF syndromes and exemplified an interdisciplinary bioinformatic approach for a comparative analysis of both syndromes using different data modalities.