46 research outputs found

    Identifying disease-associated genes based on artificial intelligence

    Identifying disease-gene associations can help improve the understanding of disease mechanisms, which has a variety of applications, such as early diagnosis and drug development. Although experimental techniques, such as linkage analysis and genome-wide association studies (GWAS), have identified a large number of associations, identifying disease genes is still challenging since experimental methods are usually time-consuming and expensive. To solve these issues, computational methods have been proposed to predict disease-gene associations. Based on the characteristics of existing computational algorithms in the literature, we can roughly divide them into three categories: network-based methods, machine learning-based methods, and other methods. No matter what models are used to predict disease genes, the proper integration of multi-level biological data is the key to improving prediction accuracy. This thesis addresses some limitations of the existing computational algorithms, and integrates multi-level data via artificial intelligence techniques. The thesis starts with a comprehensive review of computational methods, databases, and evaluation methods used in predicting disease-gene associations, followed by one network-based method and four machine learning-based methods. The first chapter introduces the background information, objectives of the studies and structure of the thesis. After that, a comprehensive review is provided in the second chapter to discuss the existing algorithms as well as the databases and evaluation methods used in existing studies. With the objectives and future directions established, the thesis then presents five computational methods for predicting disease-gene associations. The first method proposed in Chapter 3 considers the issue of non-disease gene selection. A shortest path-based strategy is used to select reliable non-disease genes from a disease gene network and a differential network. 
The selected genes are then used by a network-energy model to improve its performance. The second method proposed in Chapter 4 constructs sample-based networks for case samples and uses them to predict disease genes. This strategy improves the quality of protein-protein interaction (PPI) networks, which further improves the prediction accuracy. Chapter 5 presents a generic model which applies multimodal deep belief nets (DBN) to fuse different types of data. Network embeddings extracted from PPI networks and gene ontology (GO) data are fused with the multimodal DBN to obtain cross-modality representations. Chapter 6 presents another deep learning model which uses a convolutional neural network (CNN) to integrate gene similarities with other types of data. Finally, the fifth method proposed in Chapter 7 is a nonnegative matrix factorization (NMF)-based method. This method maps diseases and genes onto a lower-dimensional manifold, and the geodesic distances between diseases and genes are used to predict their associations. The method can predict disease genes even if the disease under consideration has no known associated genes. In summary, this thesis has proposed several artificial intelligence-based computational algorithms to address typical issues in existing computational algorithms. Experimental results have shown that the proposed methods can improve the accuracy of disease-gene prediction.
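
The shortest path-based selection of non-disease genes mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the actual selection criterion, so everything here is an assumption: the toy network, the function names, the `min_hops` cutoff, and the choice to treat genes unreachable from any disease gene as maximally distant. It only shows the general idea of ranking candidate negatives by their hop distance from known disease genes.

```python
from collections import deque


def distance_to_disease_genes(adjacency, disease_genes):
    """Multi-source BFS: hop count from each gene to its nearest disease gene."""
    dist = {gene: 0 for gene in disease_genes if gene in adjacency}
    queue = deque(dist)
    while queue:
        gene = queue.popleft()
        for neighbor in adjacency[gene]:
            if neighbor not in dist:
                dist[neighbor] = dist[gene] + 1
                queue.append(neighbor)
    return dist


def select_non_disease_genes(adjacency, disease_genes, min_hops=2):
    """Keep genes at least `min_hops` away from every known disease gene.

    Assumption: genes with no path to a disease gene count as infinitely far.
    """
    dist = distance_to_disease_genes(adjacency, disease_genes)
    return sorted(
        gene
        for gene in adjacency
        if gene not in disease_genes
        and dist.get(gene, float("inf")) >= min_hops
    )


# Hypothetical toy network: a chain A-B-C-D plus an isolated gene E.
toy_network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": [],
}
negatives = select_non_disease_genes(toy_network, {"A"}, min_hops=2)
print(negatives)
```

In this toy case gene B is excluded because it directly interacts with the disease gene A, while C, D, and the isolated gene E are retained as candidate negatives.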

    Leveraging single-cell genomics to uncover clinical and preclinical responses to cancer immunotherapy

    Immune checkpoint inhibitors (ICIs) provide durable clinical responses in about 20% of cancer patients, but have been largely ineffective for non-immunogenic cancers that lack intratumoral T cells. Most tumors have somatic mutations that encode mutant proteins that are tumor-specific and not expressed on normal cells (termed neoantigens). Cancers with the highest mutational burdens, such as melanoma, are more likely to respond to single-agent ICIs. However, most cancers, including pancreatic ductal adenocarcinoma (PDAC), have lower mutational loads, resulting in fewer T cells infiltrating the tumor. Studies have previously demonstrated that an allogeneic GM-CSF-based vaccine enhances T cell infiltration into human pancreatic cancer. Recent work with Panc02 cells, which express around 60 neoantigens similar to human PDAC, showed that PancVAX, a neoantigen-targeted vaccine, cleared tumors in Panc02-bearing mice when paired with immune modulators. These data suggest that cancer vaccines targeting tumor neoantigens induce neoepitope-specific T cells, which can be further activated by ICIs, leading to tumor rejection. Currently, the impact of ICIs and neoantigen-targeted vaccines on immune cell expression states and the underlying mechanism of therapeutic response remain poorly defined. Comprehensive characterization of responding immune cells, particularly T cells, will be critical in understanding mechanisms of response and providing a rationale for combinatorial therapies. In this work, we develop innovative computational methods and analysis pipelines to analyze the tumor-immune microenvironment at single-cell resolution. 
We establish an algorithm to quantify differential heterogeneity in single-cell RNA-seq data, demonstrate the use of non-negative matrix factorization and transfer learning algorithms to identify previously unknown and conserved ICI responses between species, and develop a novel algorithm to physicochemically compare single-cell T cell receptor sequences. We leverage these methods in various contexts to yield new insight into the biological mechanisms underlying positive immunotherapeutic responses in diverse tumor types, including PDAC.
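
As a rough illustration of the non-negative matrix factorization step named in this abstract, the sketch below factorizes a small genes-by-cells matrix with the standard multiplicative update rules for the Frobenius objective. It is not the pipeline used in the work: the toy matrix, the rank, the iteration count, and the function names are all assumptions chosen for illustration.

```python
import numpy as np


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Factorize a non-negative matrix V (genes x cells) as W @ H.

    W (genes x rank) holds gene weights per pattern and H (rank x cells)
    holds pattern usage per cell. Uses multiplicative updates; eps avoids
    division by zero.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cells = V.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_cells)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def reconstruction_error(V, W, H):
    """Frobenius norm of the residual V - W @ H."""
    return float(np.linalg.norm(V - W @ H))


# Hypothetical toy data: two gene programs, each active in two cells.
V = np.array(
    [
        [5.0, 5.0, 0.0, 0.0],
        [4.0, 4.0, 0.0, 0.0],
        [0.0, 0.0, 3.0, 3.0],
        [0.0, 0.0, 2.0, 2.0],
    ]
)
W, H = nmf(V, rank=2)
print(reconstruction_error(V, W, H))
```

Columns of `W` can be read as expression patterns and rows of `H` as their per-cell activity. Projecting a second dataset onto a fixed `W`, which is one way to approach the transfer learning mentioned above, is not shown here.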

    2023 Medical Student Research Day Abstracts

    Medical Student Research Day is designed to highlight the breadth of research and scholarly activity that medical students have accomplished during their education at The GW School of Medicine and Health Sciences. All medical students are invited to present research regardless of the area of focus. Abstract submissions represent a broad range of research interests and disciplines, including basic and translational science, clinical research, health policy and public health research, and education-related research.

    Histopathology-selective spatial oncogenic phenotypes in non-small cell lung cancer

    Non-small cell lung cancer (NSCLC) constitutes over 85% of lung cancer. Histologically, NSCLC can be broadly classified into adenocarcinoma (AC), squamous cell carcinoma (SCC), large cell carcinoma (LCC), and adenosquamous carcinoma (ASC). AC represents about 65% of all NSCLC cases, and it can be further subdivided based on tumor size and primary growth patterns, such as papillary, acinar, and mucinous. The formation of NSCLC histotypes is orchestrated by cells of origin, genetic alterations, and microenvironmental properties. Although NSCLC carries significant heterogeneity, some genetic mutations, functional phenotypes, and therapeutic responses are associated with specific NSCLC histotypes. Therefore, understanding histotype-selective etiology becomes essential for mechanistic studies and therapeutic applications in the NSCLC research field. Image-based tissue phenotyping has been commonly used for histological classification. It also allows the direct visualization of the distribution and expression of functional molecules. Quantifying such in situ phenotypes can be applied to hypothesis-based functional studies or data-driven correlative analyses. The first part of this thesis developed a spatial image analysis tool package. The making of Spa-RQ, an open-source tool package for image registration and quantification, reflected the need to perform spatial phenotyping using serial tissue sections in a standardized laboratory workflow. Subsequently, we applied Spa-RQ to identify the histotype-selective, rather than genetically defined, activation of MAPK, AKT, and mTOR signaling pathways in murine and human NSCLC samples. The diverse co-activation patterns between these pathways in different tissue compartments, measured as marker expression overlap using Spa-RQ, may be associated with heterogeneous responses to combinatorial targeted therapies. 
The second part of this thesis work investigated the histotype-selective functions of a potential therapeutic target. The lung developmental transcription factor SOX9 is silenced in normal adult lung epithelia while it is re-expressed in NSCLC tissues. Its oncogenicity is widely acknowledged but has thus far not been confirmed in NSCLC subtypes. Analyzing the correlation between SOX9 expression and histotype-specific clinical staging, survival, and invasiveness revealed a clinical significance for increased SOX9 expression only in non-mucinous ACs, despite its broad expression in ASC, SCC, and mucinous AC. Supporting this, by comparing the histotype spectra in mouse models following Sox9 loss, we identified a critical role of SOX9 in promoting lung papillary AC progression. On the other hand, its expression was not required for developing squamous and mucinous structure tissues. Finally, using spatial phenotyping, we explained such opposing roles of SOX9 in NSCLC subtypes by the different cells of origin and microenvironmental properties: SOX9 expression was required to form advanced AC from the lung alveolar progenitor cells; on the contrary, its expression was dispensable for SCC development and even interfered with squamous metastasis. Therefore, this work exposed SOX9 as a potential drug target specific to a subgroup of lung AC. In summary, the identification of histotype-selective functional oncogenic phenotypes, as achieved in this thesis, contributes to understanding the heterogeneous nature of tumorigenesis, cancer progression, and drug sensitivities.
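
The idea of measuring pathway co-activation as marker expression overlap on registered serial sections can be sketched as follows. This is not the Spa-RQ interface, which the abstract does not describe: the function name, the intensity thresholds, and the toy images are hypothetical, and the two images are assumed to be already registered pixel to pixel.

```python
import numpy as np


def overlap_metrics(marker_a, marker_b, threshold_a, threshold_b):
    """Compare two registered marker images of identical shape.

    Pixels above a threshold count as marker-positive. Returns the Jaccard
    index plus the fraction of each marker's positive area that is also
    positive for the other marker.
    """
    pos_a = marker_a > threshold_a
    pos_b = marker_b > threshold_b
    both = int(np.logical_and(pos_a, pos_b).sum())
    either = int(np.logical_or(pos_a, pos_b).sum())
    n_a, n_b = int(pos_a.sum()), int(pos_b.sum())
    return {
        "jaccard": both / either if either else 0.0,
        "a_also_b": both / n_a if n_a else 0.0,
        "b_also_a": both / n_b if n_b else 0.0,
    }


# Hypothetical 2x2 intensity images for two pathway markers.
image_a = np.array([[10, 200], [220, 30]])
image_b = np.array([[5, 210], [20, 240]])
metrics = overlap_metrics(image_a, image_b, threshold_a=100, threshold_b=100)
print(metrics)
```

Restricting the comparison to a tissue compartment would amount to applying an additional boolean mask to both positive-pixel maps before counting.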

    Molecular methods for the detection of infectious diseases: bringing diagnostics to the point-of-care

    Human infectious diseases represent a leading cause of morbidity and mortality globally, caused by human-infective pathogens such as bacteria, viruses, parasites or fungi. Point-of-care (POC) diagnostics allow accessible, simple, and rapid identification of the organism causing the infection which is crucial for successful prognostic outcomes, clinical management, surveillance and isolation. The research conducted in this thesis aims to investigate novel methods for molecular-based diagnostics. This multidisciplinary project is divided into three main sections: (i) molecular methods for enhanced nucleic acid amplification, (ii) POC technologies, and (iii) sample preparation. The application, design and optimisation of loop-mediated isothermal amplification (LAMP) is investigated from a molecular perspective for the diagnostics of emerging infectious pathogens and antimicrobial resistance. LAMP assays were designed to target pathogens responsible for parasitic (malaria), bacterial and viral (COVID-19) infections, as well as antimicrobial resistance. A novel LAMP-based method for the detection of single nucleotide polymorphisms was developed and applied for diagnostics of antimicrobial resistance, emerging variants and genetic disorders. The method was validated for the detection of artemisinin-resistant malaria. Furthermore, this thesis reports the optimisation of LAMP from a biochemical perspective through the evaluation of its core reagents and the incorporation of enhancing agents to improve its specificity and sensitivity. In order to remove cold-chain storage from the diagnostic workflow, the optimised LAMP protocol was designed to be compatible with lyophilisation. Translation of LAMP to the POC demands the development of detection technologies that are compatible with the advantages offered by isothermal amplification. 
The use of simple, accessible and portable technologies is investigated in this thesis through the development of: (i) a novel colorimetric LAMP detection method for end-point and low-cost detection, and (ii) the combination of LAMP with an electrochemical biosensing platform based on ion-sensitive field effect transistors (ISFETs) fabricated in unmodified complementary metal-oxide semiconductor (CMOS) technology for real-time detection. Lastly, current nucleic acid extraction methods cannot be transferred for use outside the laboratory. Research into novel methods for low-cost and electricity-free sample preparation was carried out using cellulose matrices. A novel, rapid (under 10 min) and efficient nucleic acid extraction method from dried blood spots was developed. A sample-to-result POC test requires the implementation and integration of molecular biology, cutting-edge technology and data-driven approaches. The work presented in this thesis aims to set new benchmarks for the detection of infectious diseases at the POC by leveraging developments in molecular biology and digital technologies.

    Epigenetic age prediction and rejuvenation

    Ageing is a complex, multi-faceted process that afflicts all humans. It invariably increases susceptibility to a range of diseases such as cancer and neurological disorders. Drugs that mimic calorie restriction show promise in slowing down ageing, but very few treatments appear to be able to actively reverse ageing. Partially reprogrammed stem cells have shown potential as an anti-ageing therapy when used to safely rejuvenate mice without tumour incidence. The question remained as to what exactly occurred at a cellular level. Was a subpopulation of cells dedifferentiating, or partially dedifferentiating, and causing a rejuvenative effect by being more stem-like? Or were the cells epigenetically rejuvenated, becoming more youthful without loss of somatic cell identity? To test either of these hypotheses, two biomarkers were required to track (i) biological ageing and (ii) dedifferentiation state. By analysing a previously published dataset of fibroblasts dedifferentiating to induced pluripotent stem cells (iPSCs) over a 49-day time-course, I helped assess the dynamics of cellular ageing. Epigenetic age was used as a proxy for biological age, while RNA microarray data was used to assess the state of dedifferentiation (i.e. by comparing fibroblast-specific gene expression with pluripotency gene markers). Partially reprogrammed cells (between days 7 and 15 of dedifferentiation) declined in predicted age (also known as epigenetic age, eAge), while somatic cell identity was maintained. This shows that loss of somatic gene expression and epigenetic age follow different kinetics, suggesting that they can be uncoupled and a possible “safe period” exists where rejuvenation can be achieved with a minimized risk of cancer. While epigenetic clocks appear to reflect biological age in many respects, their true underlying function remains a mystery, and the precise aspects of ageing they capture are unclear. 
For example, differences between epigenetic age and chronological age that are associated with ageing disease states could be caused by biological and technical biases. Biological biases can arise from mutations affecting the DNA methylation machinery, resulting in global sweeps in methylation. Technical biases may arise from errors in bisulphite (BS) conversion, which could cause a slight overestimation in percentage methylation and therefore alter eAge estimates. To explore how robust the epigenetic clocks are to sweeps of global methylation, incremental increases and decreases of global methylation were simulated in a large cohort. I showed that epigenetic clocks are not impervious to gradual, global changes in methylation. I also showed how discrete alterations in methylation state can cause a significant difference in eAge compared to a control group, which conceivably could occur in experiments testing rare genetic diseases. I also present an epigenetic clock based on average methylation over genomic regions, rather than individual CpGs. This clock provides a more robust method of predicting age, which may pave the way for more accurate age predictors using mouse RRBS data. This thesis has demonstrated that epigenetic clocks are invaluable tools for exploring health-span extending therapies. However, caution must be taken when analysing epigenetic data, as mutations and technical issues may confound analysis. Nonetheless, epigenetic clocks have shown great potential in the molecular ageing field. By understanding the precise nature of eAge, new avenues towards anti-ageing therapies may also be opened.
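
Two ideas from this abstract, a clock built on region-averaged methylation and a simulated global methylation shift, can be sketched together. This is a toy under stated assumptions, not the thesis method: the data are synthetic, the CpG-to-region assignment is hypothetical, and ordinary least squares stands in for whatever regression the actual clock uses.

```python
import numpy as np


def region_means(beta, region_of_cpg, n_regions):
    """Average CpG-level methylation (samples x CpGs) into samples x regions."""
    out = np.zeros((beta.shape[0], n_regions))
    for region in range(n_regions):
        out[:, region] = beta[:, region_of_cpg == region].mean(axis=1)
    return out


def fit_clock(features, ages):
    """Least-squares fit of age on region features; returns [intercept, weights]."""
    design = np.column_stack([np.ones(len(ages)), features])
    coef, *_ = np.linalg.lstsq(design, ages, rcond=None)
    return coef


def predict_age(coef, features):
    """Apply a fitted clock to a samples x regions feature matrix."""
    return coef[0] + features @ coef[1:]


# Synthetic cohort: 50 samples, 12 CpGs grouped into 3 hypothetical regions.
rng = np.random.default_rng(0)
ages = rng.uniform(20, 80, size=50)
region_of_cpg = np.repeat(np.arange(3), 4)
slopes = np.array([0.004, -0.003, 0.002])[region_of_cpg]
beta = 0.5 + slopes * (ages[:, None] - 50) + rng.normal(0, 0.02, (50, 12))
beta = np.clip(beta, 0.0, 1.0)

features = region_means(beta, region_of_cpg, 3)
coef = fit_clock(features, ages)
baseline = predict_age(coef, features)

# Simulate a global sweep: every CpG gains 5 percentage points of methylation.
swept = np.clip(beta + 0.05, 0.0, 1.0)
shifted = predict_age(coef, region_means(swept, region_of_cpg, 3))
mean_shift = float(np.mean(shifted - baseline))
print(mean_shift)
```

Because the model is linear, a uniform shift moves every prediction by the shift multiplied by the sum of the region weights, so a clock is only insensitive to global sweeps if its weights happen to cancel.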