61 research outputs found

    Deep learning-based diagnostic system for malignant liver detection

    Cancer is the second most common cause of death of human beings, whereas liver cancer is the fifth most common cause of mortality. The prevention of deadly diseases in living beings requires timely, independent, accurate, and robust detection of ailment by a computer-aided diagnostic (CAD) system. Executing such intelligent CAD requires some preliminary steps, including preprocessing, attribute analysis, and identification. In recent studies, conventional techniques have been used to develop computer-aided diagnosis algorithms. However, such traditional methods could immensely affect the structural properties of processed images with inconsistent performance due to variable shape and size of region-of-interest. Moreover, the unavailability of sufficient datasets makes the performance of the proposed methods doubtful for commercial use. To address these limitations, I propose novel methodologies in this dissertation. First, I modified a generative adversarial network to perform deblurring and contrast adjustment on computed tomography (CT) scans. Second, I designed a deep neural network with a novel loss function for fully automatic precise segmentation of liver and lesions from CT scans. Third, I developed a multi-modal deep neural network to integrate pathological data with imaging data to perform computer-aided diagnosis for malignant liver detection. The dissertation starts with background information that discusses the proposed study objectives and the workflow. Afterward, Chapter 2 reviews a general schematic for developing a computer-aided algorithm, including image acquisition techniques, preprocessing steps, feature extraction approaches, and machine learning-based prediction methods. The first study proposed in Chapter 3 discusses blurred images and their possible effects on classification. A novel multi-scale GAN network with residual image learning is proposed to deblur images. 
The second method in Chapter 4 addresses the issue of low-contrast CT scan images. A multi-level GAN is utilized to enhance image contrast, and the enhanced images improve cancer diagnosis performance. Chapter 5 proposes a deep neural network for the segmentation of liver and lesions from abdominal CT scan images. A modified U-Net with a novel loss function can precisely segment minute lesions. Chapter 6 introduces a multi-modal approach for diagnosing liver cancer variants, in which the pathological data are integrated with CT scan images. In summary, this dissertation presents novel algorithms for preprocessing and disease detection. Furthermore, the comparative analysis validates the effectiveness of the proposed methods in computer-aided diagnosis.

    Genetic Algorithms for Feature Selection and Classification of Complex Chromatographic and Spectroscopic Data

    A basic methodology for analyzing large multivariate chemical data sets based on feature selection is proposed. Each chromatogram or spectrum is represented as a point in a high-dimensional measurement space. A genetic algorithm for feature selection and classification is applied to the data to identify features that optimize the separation of the classes in a plot of the two or three largest principal components of the data. A good principal component plot can only be generated using features whose variance or information is primarily about differences between classes in the data. Hence, feature subsets that maximize the ratio of between-class to within-class variance are selected by the pattern recognition genetic algorithm. Furthermore, the structure of the data set can be explored; for example, new classes can be discovered by simply tuning various parameters of the fitness function of the pattern recognition genetic algorithm. The proposed method has been validated on a wide range of data. A two-step procedure for pattern recognition analysis of spectral data has been developed. First, wavelets are used to denoise and deconvolute spectral bands by decomposing each spectrum into wavelet coefficients, which represent the sample's constituent frequencies. Second, the pattern recognition genetic algorithm is used to identify wavelet coefficients characteristic of the class. This method was employed in several studies involving spectral library searching. In one study, a search pre-filter to detect the presence of carboxylic acids from vapor phase infrared spectra, which had previously eluded prominent researchers, has been successfully formulated and validated. In another study, this same approach has been used to develop a pattern recognition assisted infrared library searching technique to determine the model, manufacturer, and year of the vehicle from which a clear coat paint smear originated. 
The pattern recognition genetic algorithm has also been used to develop a potential method to identify molds in indoor environments using volatile organic compounds. A distinct profile indicative of microbial volatile organic compounds was developed from air sampling data that could be readily differentiated from the blank for both high mold count and moderate mold count exposure samples. The utility of the pattern recognition genetic algorithm for discovery of biomarker candidates from genomic and proteomic data sets has also been shown. Chemistry Department
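The selection criterion named in the abstract above, the ratio of between-class to within-class variance, can be illustrated for a single feature. The following is a minimal sketch and not the authors' fitness function, which scores feature subsets through principal component plots; the function name `variance_ratio` and the toy data are assumptions introduced for illustration.

```python
from statistics import fmean


def variance_ratio(values, labels):
    """Between-class to within-class sum-of-squares ratio for one feature.

    Hypothetical helper: a per-feature simplification of the criterion
    described in the abstract, not the published fitness function.
    """
    grand_mean = fmean(values)
    # Group the feature values by class label.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    # Spread of the class means around the grand mean, weighted by class size.
    between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    # Spread of each sample around its own class mean.
    within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    return between / within if within > 0 else float("inf")


labels = ["a", "a", "a", "b", "b", "b"]
informative = [1, 2, 3, 11, 12, 13]   # classes well separated
uninformative = [1, 12, 3, 11, 2, 13]  # classes overlap heavily

print(variance_ratio(informative, labels))
print(variance_ratio(uninformative, labels))
```

A feature selector would retain features with a high ratio, such as the first toy feature, and discard those with a low one.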

    Multivariate classification of gene expression microarray data

    Gene expression data obtained from microarray analysis are often used to classify cells. In this thesis, a probabilistic version of the Discriminant Partial Least Squares method (p-DPLS) is used to classify samples from their gene expression profiles. p-DPLS is based on Bayes' rule for the posterior probability. Such classifiers are forced to always assign a class. To overcome this limitation, a reject option has been implemented. This option allows samples with a high risk of misclassification (that is, ambiguous samples and outliers) to be rejected. The reject option combines criteria based on the x-residuals, the leverage, and the predicted values. In addition, a variable selection method is developed to choose the most relevant genes, since most of the genes analysed with a microarray are irrelevant to the particular classification purpose and can confuse the classifier. Finally, DPLS is extended to multi-class classification by combining PLS with linear discriminant analysis.
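A classifier with a reject option, as described in this abstract, can be sketched from Bayes' rule alone. The example below is a minimal illustration, assuming Gaussian class-conditional densities over a single predicted value; it covers only the ambiguity and outlier cases and omits the x-residual and leverage criteria. The function names, the `min_posterior` threshold, and the class parameters are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def classify_with_reject(y_hat, class_params, priors, min_posterior=0.9):
    """Return (label, posteriors); label is None when the sample is rejected.

    class_params maps each label to the assumed (mean, sd) of the predicted
    value for that class. The 0.9 threshold is an illustrative choice.
    """
    # Bayes' rule: posterior is proportional to prior times likelihood.
    joint = {c: priors[c] * gaussian_pdf(y_hat, *class_params[c]) for c in class_params}
    evidence = sum(joint.values())
    if evidence == 0.0:
        # No class explains the sample at all: treat it as an outlier.
        return None, {}
    posteriors = {c: j / evidence for c, j in joint.items()}
    best = max(posteriors, key=posteriors.get)
    if posteriors[best] < min_posterior:
        # No class is sufficiently probable: treat the sample as ambiguous.
        return None, posteriors
    return best, posteriors


params = {"A": (0.0, 0.2), "B": (1.0, 0.2)}
priors = {"A": 0.5, "B": 0.5}
for y_hat in (0.05, 0.5, 100.0):
    label, _ = classify_with_reject(y_hat, params, priors)
    print(y_hat, label)
```

In this sketch a prediction near one class centre is accepted, a prediction midway between the two centres is rejected as ambiguous, and a prediction far from both is rejected as an outlier.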

    Knowledge Management Approaches for predicting Biomarker and Assessing its Impact on Clinical Trials

    The recent success of companion diagnostics, along with the increasing regulatory pressure for better identification of the target population, has created an unprecedented incentive for drug discovery companies to invest in novel strategies for stratified biomarker discovery. In keeping with this trend, trials with stratified biomarkers in drug development have quadrupled in the last decade but still represent a small part of all interventional trials, reflecting multiple co-developmental challenges of therapeutic compounds and companion diagnostics. To overcome this challenge, varied knowledge management and systems biology approaches are adopted in the clinic to analyze and interpret an ever-increasing collection of OMICS data. By semi-automatic screening of more than 150,000 trials, we filtered trials with stratified biomarkers to analyse their therapeutic focus and major drivers, and elucidated the impact of stratified biomarker programs on trial duration and completion. The analysis clearly shows that cancer is the major focus for trials with stratified biomarkers. However, targeted therapies in cancer require more accurate stratification of the patient population. This can be augmented by a fresh approach of selecting a new class of biomolecules, i.e. miRNA, as candidate stratification biomarkers. miRNA plays an important role in tumorigenesis by regulating the expression of oncogenes and tumor suppressors, thus affecting cell proliferation, differentiation, apoptosis, invasion, and angiogenesis. miRNAs are potential biomarkers in different cancers. However, the relationship between the response of cancer patients to targeted therapy and the resulting modifications of the miRNA transcriptome in pathway regulation is poorly understood. 
Ever-increasing pathway and miRNA-mRNA interaction databases, together with freely available mRNA and miRNA expression data for multiple cancer therapies, have created an unprecedented opportunity to decipher the role of miRNAs in early prediction of therapeutic efficacy in diseases. We present a novel SMARTmiR algorithm to predict the role of miRNA as a therapeutic biomarker for treatment with an anti-EGFR monoclonal antibody, i.e. cetuximab, in colorectal cancer (CRC). An optimised and fully automated version of the algorithm has the potential to be used as a clinical decision support tool. Moreover, this research will provide the scientific community with a comprehensive and valuable knowledge map demonstrating functional biomolecular interactions in colorectal cancer. This research also detected seven miRNAs, i.e. hsa-miR-145, hsa-miR-27a, hsa-miR-155, hsa-miR-182, hsa-miR-15a, hsa-miR-96 and hsa-miR-106a, as top stratified biomarker candidates for cetuximab therapy in CRC that were not reported previously. Finally, a prospective plan for the future of biomarker research in cancer drug development has been drawn up, focusing on reducing the risk of the most expensive phase III drug failures.

    Role of machine learning in early diagnosis of kidney diseases.

    Machine learning (ML) and deep learning (DL) approaches have been used as indispensable tools in modern artificial intelligence-based computer-aided diagnostic (AI-based CAD) systems that can provide non-invasive, early, and accurate diagnosis of a given medical condition. These AI-based CAD systems have proven themselves to be reproducible and have the generalization ability to diagnose new unseen cases with several diseases and medical conditions in different organs (e.g., kidneys, prostate, brain, liver, lung, breast, and bladder). In this dissertation, we will focus on the role of such AI-based CAD systems in early diagnosis of two kidney diseases, namely: acute rejection (AR) post kidney transplantation and renal cancer (RC). A new renal computer-assisted diagnostic (Renal-CAD) system was developed to precisely diagnose AR post kidney transplantation at an early stage. The developed Renal-CAD system performs the following main steps: (1) auto-segmentation of the renal allograft from surrounding tissues from diffusion-weighted magnetic resonance imaging (DW-MRI) and blood oxygen level-dependent MRI (BOLD-MRI), (2) extraction of image markers, namely: voxel-wise apparent diffusion coefficients (ADCs) are calculated from DW-MRI scans at 11 different low and high b-values and then represented as cumulative distribution functions (CDFs), and the transverse relaxation rate (R2*) values are extracted from the segmented kidneys using BOLD-MRI scans at different echo times, (3) integration of multimodal image markers with the associated clinical biomarkers, serum creatinine (SCr) and creatinine clearance (CrCl), and (4) diagnosing renal allograft status as nonrejection (NR) or AR by utilizing these integrated biomarkers and the developed deep learning classification model built on stacked auto-encoders (SAEs). 
Using a leave-one-subject-out cross-validation approach along with SAEs on a total of 30 patients with transplanted kidney (AR = 10 and NR = 20), the Renal-CAD system demonstrated 93.3% accuracy, 90.0% sensitivity, and 95.0% specificity in differentiating AR from NR. Robustness of the Renal-CAD system was also confirmed by the area under the curve (AUC) value of 0.92. Using a stratified 10-fold cross-validation approach, the Renal-CAD system demonstrated its reproducibility and robustness with a diagnostic accuracy of 86.7%, sensitivity of 80.0%, specificity of 90.0%, and AUC of 0.88. In addition, a new renal cancer CAD (RC-CAD) system for precise diagnosis of RC at an early stage was developed, which incorporates the following main steps: (1) estimating the morphological features by applying a new parametric spherical harmonic technique, (2) extracting appearance-based features, namely: first-order textural features are calculated and second-order textural features are extracted after constructing the gray-level co-occurrence matrix (GLCM), (3) estimating the functional features by constructing wash-in/wash-out slopes to quantify the enhancement variations across different contrast-enhanced computed tomography (CE-CT) phases, and (4) integrating all the aforementioned features and modeling a two-stage multilayer perceptron artificial neural network (MLP-ANN) classifier to classify the renal tumor as benign or malignant and identify the malignancy subtype. On a total of 140 RC patients (malignant = 70 patients (ccRCC = 40 and nccRCC = 30) and benign angiomyolipoma tumors = 70), the developed RC-CAD system was validated using a leave-one-subject-out cross-validation approach. The developed RC-CAD system achieved a sensitivity of 95.3% ± 2.0%, a specificity of 99.9% ± 0.4%, and Dice similarity coefficient of 0.98 ± 0.01 in differentiating malignant from benign renal tumors, as well as an overall accuracy of 89.6% ± 5.0% in the sub-typing of RCC. 
The diagnostic abilities of the developed RC-CAD system were further validated using a randomly stratified 10-fold cross-validation approach. The results obtained using the proposed MLP-ANN classification model outperformed other machine learning classifiers (e.g., support vector machine, random forests, and relational functional gradient boosting) as well as other approaches from the literature. In summary, machine and deep learning approaches have shown potential to be utilized to build AI-based CAD systems. This is evidenced by the promising diagnostic performance obtained by both the Renal-CAD and RC-CAD systems. For the Renal-CAD, the integration of functional markers extracted from multimodal MRIs with clinical biomarkers using the SAEs classification model potentially improved the final diagnostic results, as evidenced by high accuracy, sensitivity, and specificity. The developed Renal-CAD demonstrated high feasibility and efficacy for early, accurate, and non-invasive identification of AR. For the RC-CAD, integrating morphological, textural, and functional features extracted from CE-CT images using an MLP-ANN classification model enhanced the final results in terms of accuracy, sensitivity, and specificity, making the proposed RC-CAD a reliable noninvasive diagnostic tool for RC. The early and accurate diagnosis of AR or RC will help physicians to provide early intervention with the appropriate treatment plan to prolong the life span of the diseased kidney, increase the survival chance of the patient, and thus improve the healthcare outcome in the U.S. and worldwide.
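The Renal-CAD figures reported in this abstract can be checked against the stated cohort of 10 AR and 20 NR patients. The sketch below is a worked consistency check; the confusion-matrix counts are inferred from the reported percentages rather than given in the abstract, and the function name `diagnostic_metrics` is introduced here for illustration.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # fraction of AR cases detected
    specificity = tn / (tn + fp)               # fraction of NR cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts (an assumption): 90.0% sensitivity on 10 AR patients implies
# 9 true positives; 95.0% specificity on 20 NR patients implies 19 true negatives.
loso = diagnostic_metrics(tp=9, fn=1, tn=19, fp=1)

# Same reasoning for the 10-fold result: 80.0% of 10 and 90.0% of 20.
ten_fold = diagnostic_metrics(tp=8, fn=2, tn=18, fp=2)

for name, (sens, spec, acc) in (("leave-one-subject-out", loso), ("10-fold", ten_fold)):
    print(f"{name}: sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under these inferred counts, the accuracies work out to 28/30 and 26/30, which agree with the reported 93.3% and 86.7%.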

    Characterization of cell type-specific molecular heterogeneity in cancer using multi-omic approaches

    Tumors are composed of heterogeneous cell types, each with its own unique molecular profile. Recent advances in single cell genomics technologies have begun to increase our understanding of the molecular heterogeneity that exists in tumors, with particular focus on gene expression and chromatin accessibility profiles. However, due to limitations in methods for certain sample types and the high cost of single cell genomics, bulk tumor molecular profiling has been and remains widely used. In addition, other facets of single cell epigenomic profiling, particularly methylation and hydroxymethylation, remain underexplored. Thus, investigations to understand the cell type-specific epigenetic heterogeneity and the cooperation among various molecular layers to regulate tumorigenesis are needed. In this thesis, I utilize a multi-omic approach integrating DNA methylation, hydroxymethylation, chromatin accessibility, and gene expression profiles to investigate unique single cell type-specific features in 1) epithelial-to-mesenchymal transition and 2) pediatric central nervous system tumors. First, I demonstrate the shared and distinct epigenetic profiles that are associated with single cells undergoing epithelial-to-mesenchymal transition. With a multi-omic approach, I identify increased hydroxymethylation in binding motifs of transcription factors critical in regulating epithelial-to-mesenchymal transition. Then, I shift my focus to characterize the cellular heterogeneity in pediatric central nervous system tumors and transcriptomic alterations associated with these tumors, while accounting for cell type composition, with single nuclei gene expression data. I detect novel pediatric central nervous system tumor-associated genes that are differentially expressed. Finally, I illustrate the cytosine modification alterations that occur predominantly in the progenitor-like cell types of pediatric central nervous system tumors with a multi-omic approach. 
I determine associations between cell type-specific hydroxymethylation alterations and cell type-specific gene expression changes. Together, these findings emphasize the need to consider cellular identity when determining the molecular heterogeneity that exists in various cancer contexts. Moreover, these works collectively suggest the utility of multi-omic approaches to uncover novel insights into underlying tumor biology.

    Examining lipid metabolism of colorectal adenomas and carcinomas using Rapid Evaporative Ionisation Mass Spectrometry (REIMS)

    Background: There is an unmet need for real-time intraoperative colorectal tissue recognition, which would promote personalised oncologic decision making. Rapid Evaporative Ionisation Mass Spectrometry (REIMS) analyses the composition of cellular lipids through the aerosol generated from electrosurgical instruments, providing a novel diagnostic platform and surgeon feedback. Thesis Hypothesis: Colorectal lipid metabolism and cellular lipid composition are associated with the phenotype of colorectal adenomas and carcinomas, which can be leveraged for tissue recognition in vivo. Methods: This thesis contains three work packages. First, a method for REIMS spectral quality control was developed based on a human dataset, and analysis of a porcine model assessed the spectral impact of technical and environmental factors. Second, an ex vivo spectral reference database was constructed from analysis of human colorectal tissues, assessing the ability of REIMS for tissue recognition. Finally, REIMS was translated into the operating theatre for proof-of-principle application during transanal minimally invasive surgery (TAMIS). Results: Sensitivity analyses revealed seven minimum quality criteria for REIMS spectra to be included in all future statistical analyses, with quality also impacted by low diathermy power, coagulation mode and tissue contamination. Based on tissue from 161 patients, REIMS could differentiate colorectal normal, adenoma and cancer tissue with 91.1% accuracy, and disease from normal with 93.5% accuracy. REIMS could risk-stratify adenomas by predicting grade of dysplasia, but could not predict histological features of poor prognosis in cancers. 61 pertinent lipid metabolites were structurally identified. REIMS was coupled to TAMIS in seven patients. Optimisation of the workflow successfully increased signal intensity, with tissue recognition showing high accuracy in vivo and identification of a cancer-involved margin. 
Discussion: This thesis demonstrates that REIMS can be optimised and applied for accurate real-time colorectal tissue recognition based on cellular lipid composition. This can be translated in vivo, with promising results during first-in-man mass spectrometry-coupled TAMIS.

    Linkage, association, and haplotype analysis: A spectrum of approaches to elucidate the genetic influences of complex human traits

    The goal of human genetics is to identify genetic variants that influence a certain trait with the intent to provide a better understanding of the biology behind that trait. As technologies and statistical methods towards this goal have developed, there has been a change in the approaches to identify trait-causing variants. The three projects reported here cover a range of approaches. Early studies focused on family-based data, using linkage analysis to find regions of the genome shared by members with similar trait values. This approach was used to confirm the involvement of CYP2E1 in the level of response to alcohol in sibling pairs with an alcoholic parent. With the advent of high-throughput genotyping panels, the field of human genetics has shifted to population-based association studies that seek to find variants that correlate with a trait. This approach was used to search for regions of the genome that confer risk for Pick's disease, a spectrum of heterogeneous dementia diseases, and to reproduce the association with MAPT, a gene with known disease-causing mutations. Haplotype-based analysis approaches have emerged to improve the analysis of genomic data. A novel algorithm for haplotype-based analysis was developed to identify long haplotypes shared in a population based on genotypes from genome-wide association data and was found to be very accurate when predicting haplotypes within the shared regions. Together, these three projects represent the past, present, and future of the study of human genetics.