1,755 research outputs found

    Strong compound-risk factors: Efficient discovery through emerging patterns and contrast sets

    Odds ratio (OR), relative risk (RR, or risk ratio), and absolute risk reduction (ARR, or risk difference) are biostatistical measures widely used to identify significant risk factors in dichotomous groups of subjects. In the past, they have mostly been used to assess simple risk factors. In this paper, we introduce the concept of compound-risk factors to broaden the applicability of these statistical tests to assessing factor interplay. We observe that compound-risk factors with a high risk ratio or a large risk difference have a one-to-one correspondence to strong emerging patterns or strong contrast sets, two types of patterns that have been extensively studied in the data mining field. This relationship was previously unknown to researchers, and efficient algorithms for discovering strong compound-risk factors have been lacking. We propose a theoretical framework and a new algorithm that unify the discovery of compound-risk factors that have a strong OR, risk ratio, or risk difference. Our method guarantees that all patterns meeting a given test threshold can be efficiently discovered. Our contribution is thus the first of its kind in linking risk ratios and ORs to pattern mining algorithms, making it possible to find compound-risk factors in large-scale data sets. In addition, we show that using compound-risk factors can improve classification accuracy in probabilistic learning algorithms on several disease data sets, because these compound-risk factors capture the interdependency between important data attributes. © 2007 IEEE
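
The three measures named in this abstract can be illustrated with a minimal sketch. It is not the paper's mining algorithm: it only evaluates one candidate compound factor, treated here as a set of attribute values that must all be present. The function names, the toy records, and the attribute labels are hypothetical, and the sketch assumes no cell of the 2x2 table is zero.

```python
def contingency_counts(records, factor):
    """Build a 2x2 table for one (compound) factor.

    records: iterable of (attributes, is_case) pairs; attributes is a set.
    factor:  set of attribute values; a subject counts as exposed only
             if it carries every value in the factor.
    Returns (a, b, c, d) = (exposed cases, exposed non-cases,
                            unexposed cases, unexposed non-cases).
    """
    a = b = c = d = 0
    for attributes, is_case in records:
        exposed = factor <= attributes  # subset test
        if exposed and is_case:
            a += 1
        elif exposed:
            b += 1
        elif is_case:
            c += 1
        else:
            d += 1
    return a, b, c, d


def risk_measures(a, b, c, d):
    """Odds ratio, risk ratio, and risk difference (assumes non-zero cells)."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "odds_ratio": (a * d) / (b * c),
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
    }


# Hypothetical toy data, not taken from the paper.
records = (
    [({"smoker", "obese"}, True)] * 6
    + [({"smoker", "obese"}, False)] * 2
    + [({"smoker"}, True)] * 2
    + [({"smoker"}, False)] * 6
    + [(set(), True)] * 1
    + [({"obese"}, False)] * 7
)

table = contingency_counts(records, {"smoker", "obese"})
measures = risk_measures(*table)
print(table, measures)
```

Scoring every attribute combination this way grows exponentially with the number of attributes, which is the cost the abstract's pattern-mining formulation is meant to avoid.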

    A Preliminary Investigation towards the Risk Stratification of Allogeneic Stem Cell Recipients with Respect to the Potential for Development of GVHD via Their Pre-Transplant Plasma Lipid and Metabolic Signature

    The clinical outcome of allogeneic hematopoietic stem cell transplantation (SCT) may be influenced by the metabolic status of the recipient following conditioning, which in turn may enable risk stratification with respect to the development of transplant-associated complications such as graft vs. host disease (GVHD). To better understand the impact of the metabolic profile of transplant recipients on post-transplant alloreactivity, we investigated the metabolic signature of 14 patients undergoing myeloablative conditioning followed by either human leukocyte antigen (HLA)-matched related or unrelated donor SCT, or autologous SCT. Blood samples were taken following conditioning and prior to transplant on day 0 and the plasma was comprehensively characterized with respect to its lipidome and metabolome via liquid chromatography/mass spectrometry (LCMS) and gas chromatography/mass spectrometry (GCMS). A pro-inflammatory metabolic profile was observed in patients who eventually developed GVHD. Five potential pre-transplant biomarkers, 2-aminobutyric acid, 1-monopalmitin, diacylglycerols (DG 38:5, DG 38:6), and fatty acid FA 20:1 demonstrated high sensitivity and specificity towards predicting post-transplant GVHD. The resulting predictive model demonstrated an estimated predictive accuracy of risk stratification of 100%, with area under the curve of the ROC of 0.995. The likelihood ratio of 1-monopalmitin (infinity), DG 38:5 (6.0), and DG 38:6 (6.0) also demonstrated that a patient with a positive test result for these biomarkers following conditioning and prior to transplant will be at risk of developing GVHD. Collectively, the data suggest the possibility that pre-transplant metabolic signature may be used for risk stratification of SCT recipients with respect to development of alloreactivity
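
The likelihood ratios quoted in this abstract, including the infinite one, follow from the standard definition of the positive likelihood ratio, sensitivity / (1 - specificity). The sketch below shows only that arithmetic. The sensitivity and specificity values are hypothetical, since the abstract does not report them per biomarker, and the function name is an assumption.

```python
import math


def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity); infinite when there are no false positives."""
    false_positive_rate = 1.0 - specificity
    if false_positive_rate == 0.0:
        return math.inf
    return sensitivity / false_positive_rate


# Hypothetical inputs chosen only to illustrate the two cases.
lr_finite = positive_likelihood_ratio(sensitivity=0.75, specificity=0.875)
lr_infinite = positive_likelihood_ratio(sensitivity=0.75, specificity=1.0)
print(lr_finite, lr_infinite)
```

An infinite ratio means only that no false positives occurred in the sample, which with 14 patients is a weak guarantee.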

    Targeting the hematopoietic stem cell antigen FLT3 by high-affinity T cell receptor for the treatment of high-risk acute myeloid leukemia

    Acute myeloid leukemia (AML) is a disease with poor prognosis. Fms-like tyrosine kinase 3 (FLT3) is a promising target because of its overexpression in AML cells. Efforts have been made to develop new therapeutics targeting FLT3 with small-molecule inhibitors and, most recently, with chimeric antigen receptor (CAR)-modified T cells. We generated HLA-A2-restricted, FLT3-specific T cell receptors (TCRs) to target FLT3-positive AML and hematopoietic stem cells (HSCs) in an HLA-A2-mismatched allogeneic HSC transplantation. In our proposed setup, FLT3-specific TCRs would eliminate AML cells as well as the patient's HLA-A2-positive HSCs, allowing engraftment of a healthy, HLA-A2-negative hematopoietic system. FLT3 is a self-antigen; therefore, T cells bearing high-affinity TCRs against epitopes derived from it are deleted in the thymus during T cell development. To circumvent this tolerance, we immunized a transgenic mouse model expressing a diverse human TCR repertoire and the HLA-A2 molecule (ABabDII). The candidate epitopes for immunization, FLT3₈₃₉ and FLT3₉₈₆, were selected from in silico predicted epitopes based on their binding affinity to HLA-A2 and homology to mouse FLT3. We identified one TCR against FLT3₈₃₉ (6546-IMS) and two TCRs against FLT3₉₈₆ (6780-GLL and 6782-GLL). IFN-γ release was detected only from 6782-GLL T cells after overnight co-culture with a K562 cell line modified to express high levels of FLT3 and HLA-A2, showing that the FLT3₉₈₆ epitope is naturally processed and presented. We tested the FLT3₉₈₆-specific TCRs on three different cell lines that express FLT3 endogenously. We detected neither CD137 upregulation by FACS nor IFN-γ release by ELISA from either of the FLT3₉₈₆-specific TCRs against the AML cell line THP1. On the other hand, co-culture with SEM and MV-4;11 cell lines, which express FLT3 endogenously and were modified to express the HLA-A2 molecule, and with THP1 cells modified to overexpress FLT3, induced CD137 upregulation only on 6780-GLL T cells but did not trigger any IFN-γ secretion, suggesting that higher FLT3 availability might be required for target cell recognition by the 6780-GLL TCR. This could be due to (i) the sub-optimal avidities of the identified TCRs for the pMHC complex, (ii) low binding affinity of the FLT3₉₈₆ epitope to the HLA-A2 molecule, resulting in poor presentation on the cell surface. In addition, recognition of MV-4;11 cells, which carry the FLT3-ITD mutation, suggested that the FLT3₉₈₆ epitope is produced from both wild-type and mutated FLT3. During in vitro safety testing, we discovered high intracellular FLT3 expression in the Purkinje cells of the human cerebellum. We have stopped our attempt to identify high-affinity FLT3-specific TCRs due to potential cerebellar toxicity. We believe FLT3 could still be a safe, valuable target for therapies other than TCR-modified T cells.

    Acute Myeloid Leukemia

    Acute myeloid leukemia (AML) is the most common type of leukemia. The Cancer Genome Atlas Research Network has demonstrated the increasing genomic complexity of AML. In addition, the network has facilitated our understanding of the molecular events leading to this deadly form of malignancy, for which the prognosis has not improved over past decades. AML is a highly heterogeneous disease, and cytogenetics and molecular analysis of the various chromosome aberrations, including deletions, duplications, aneuploidy, balanced reciprocal translocations, and fusion of transcription factor genes and tyrosine kinases, have led to better understanding and identification of subgroups of AML with different prognoses. Furthermore, molecular classification based on mRNA expression profiling has facilitated identification of novel subclasses and defined high-, poor-risk AML based on specific molecular signatures. However, despite increased understanding of AML genetics, the outcome for AML patients, whose number is likely to rise as the population ages, has not changed significantly. Until it does, further investigation of the genomic complexity of the disease and advances in drug development are needed. In this review, leading AML clinicians and research investigators provide an up-to-date understanding of the molecular biology of the disease, addressing advances in diagnosis, classification, prognostication, and therapeutic strategies that may have significant promise and impact on overall patient survival.

    Role of machine learning in early diagnosis of kidney diseases.

    Machine learning (ML) and deep learning (DL) approaches have been used as indispensable tools in modern artificial intelligence-based computer-aided diagnostic (AI-based CAD) systems that can provide non-invasive, early, and accurate diagnosis of a given medical condition. These AI-based CAD systems have proven themselves to be reproducible and have the generalization ability to diagnose new unseen cases with several diseases and medical conditions in different organs (e.g., kidneys, prostate, brain, liver, lung, breast, and bladder). In this dissertation, we will focus on the role of such AI-based CAD systems in early diagnosis of two kidney diseases, namely: acute rejection (AR) post kidney transplantation and renal cancer (RC). A new renal computer-assisted diagnostic (Renal-CAD) system was developed to precisely diagnose AR post kidney transplantation at an early stage. The developed Renal-CAD system performs the following main steps: (1) auto-segmentation of the renal allograft from surrounding tissues from diffusion-weighted magnetic resonance imaging (DW-MRI) and blood oxygen level-dependent MRI (BOLD-MRI), (2) extraction of image markers, namely: voxel-wise apparent diffusion coefficients (ADCs) are calculated from DW-MRI scans at 11 different low and high b-values and then represented as cumulative distribution functions (CDFs), and the transverse relaxation rate (R2*) values are extracted from the segmented kidneys using BOLD-MRI scans at different echo times, (3) integration of multimodal image markers with the associated clinical biomarkers, serum creatinine (SCr) and creatinine clearance (CrCl), and (4) diagnosing renal allograft status as nonrejection (NR) or AR by utilizing these integrated biomarkers and the developed deep learning classification model built on stacked auto-encoders (SAEs). 
Using a leave-one-subject-out cross-validation approach along with SAEs on a total of 30 patients with transplanted kidneys (AR = 10 and NR = 20), the Renal-CAD system demonstrated 93.3% accuracy, 90.0% sensitivity, and 95.0% specificity in differentiating AR from NR. Robustness of the Renal-CAD system was also confirmed by the area under the curve (AUC) value of 0.92. Using a stratified 10-fold cross-validation approach, the Renal-CAD system demonstrated its reproducibility and robustness with a diagnostic accuracy of 86.7%, sensitivity of 80.0%, specificity of 90.0%, and AUC of 0.88. In addition, a new renal cancer CAD (RC-CAD) system for precise diagnosis of RC at an early stage was developed, which incorporates the following main steps: (1) estimating the morphological features by applying a new parametric spherical harmonic technique, (2) extracting appearance-based features, namely: first-order textural features are calculated and second-order textural features are extracted after constructing the gray-level co-occurrence matrix (GLCM), (3) estimating the functional features by constructing wash-in/wash-out slopes to quantify the enhancement variations across different contrast-enhanced computed tomography (CE-CT) phases, (4) integrating all the aforementioned features and modeling a two-stage multilayer perceptron artificial neural network (MLP-ANN) classifier to classify the renal tumor as benign or malignant and identify the malignancy subtype. On a total of 140 RC patients (malignant = 70 patients (ccRCC = 40 and nccRCC = 30) and benign angiomyolipoma tumors = 70), the developed RC-CAD system was validated using a leave-one-subject-out cross-validation approach. The developed RC-CAD system achieved a sensitivity of 95.3% ± 2.0%, a specificity of 99.9% ± 0.4%, and Dice similarity coefficient of 0.98 ± 0.01 in differentiating malignant from benign renal tumors, as well as an overall accuracy of 89.6% ± 5.0% in the sub-typing of RCC. 
The diagnostic abilities of the developed RC-CAD system were further validated using a randomly stratified 10-fold cross-validation approach. The results obtained using the proposed MLP-ANN classification model outperformed other machine learning classifiers (e.g., support vector machine, random forests, and relational functional gradient boosting) as well as other approaches from the literature. In summary, machine and deep learning approaches have shown potential for building AI-based CAD systems. This is evidenced by the promising diagnostic performance obtained by both the Renal-CAD and RC-CAD systems. For the Renal-CAD, the integration of functional markers extracted from multimodal MRIs with clinical biomarkers using the SAEs classification model potentially improved the final diagnostic results, as evidenced by high accuracy, sensitivity, and specificity. The developed Renal-CAD demonstrated high feasibility and efficacy for early, accurate, and non-invasive identification of AR. For the RC-CAD, integrating morphological, textural, and functional features extracted from CE-CT images using an MLP-ANN classification model enhanced the final results in terms of accuracy, sensitivity, and specificity, making the proposed RC-CAD a reliable non-invasive diagnostic tool for RC. The early and accurate diagnosis of AR or RC will help physicians to provide early intervention with the appropriate treatment plan to prolong the life span of the diseased kidney, increase the survival chance of the patient, and thus improve the healthcare outcome in the U.S. and worldwide.
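
The Renal-CAD percentages reported above can be related to confusion-matrix counts with a short sketch. The abstract gives only the cohort sizes (AR = 10, NR = 20) and the percentages. The counts below are an inference, being the only whole-number counts consistent with those figures, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Inferred counts (assumption): AR is the positive class, 10 AR and 20 NR patients.
leave_one_out = diagnostic_metrics(tp=9, fn=1, tn=19, fp=1)
ten_fold = diagnostic_metrics(tp=8, fn=2, tn=18, fp=2)

for name, metrics in (("leave-one-subject-out", leave_one_out), ("10-fold", ten_fold)):
    print(name, {key: round(100 * value, 1) for key, value in metrics.items()})
```

With only 10 AR patients, one misclassified case moves sensitivity by 10 percentage points, which is worth keeping in mind when comparing the two validation schemes.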

    Development and application of deep learning and spatial statistics within 3D bone marrow imaging

    The bone marrow is a highly specialised organ, responsible for the formation of blood cells. Despite 50 years of research, the spatial organisation of the bone marrow remains an area full of controversy and contradiction. One reason for this is that imaging of bone marrow tissue is notoriously difficult. A second is that efficient methodologies to fully extract and analyse large datasets remain the Achilles' heel of imaging-based research. In this thesis I present a pipeline for generating 3D bone marrow images, followed by large-scale data extraction and spatial statistical analysis of the resulting data. Using these techniques, in the context of 3D imaging, I am able to identify and classify the location of hundreds of thousands of cells within various bone marrow samples. I then introduce a series of statistical techniques tailored to work with spatial data, resulting in a 3D statistical map of the tissue from which multi-cellular interactions can be clearly understood. As an illustration of the power of this new approach, I apply this pipeline to diseased samples of bone marrow, with a particular focus on leukaemia and its interactions with CD8+ T cells. In so doing I show that this novel pipeline can be used to unravel complex multi-cellular interactions and assist researchers in understanding the processes taking place within the bone marrow.

    Methods for Predicting an Ordinal Response with High-Throughput Genomic Data

    Multigenic diagnostic and prognostic tools can be derived for ordinal clinical outcomes using data from high-throughput genomic experiments. A challenge in this setting is that the number of predictors is much greater than the sample size, so traditional ordinal response modeling techniques must be exchanged for more specialized approaches. Existing methods perform well on some datasets, but there is room for improvement in terms of variable selection and predictive accuracy. Therefore, we extended an impressive binary response modeling technique, Feature Augmentation via Nonparametrics and Selection (FANS), to the ordinal response setting. Through simulation studies and analyses of high-throughput genomic datasets, we showed that our Ordinal FANS method is sensitive and specific when discriminating between important and unimportant features from the high-dimensional feature space and is highly competitive in terms of predictive accuracy. Discrete survival time is another example of an ordinal response. For many illnesses and chronic conditions, it is impossible to record the precise date and time of disease onset or relapse. Further, the HIPAA Privacy Rule prevents recording of protected health information, which includes all elements of dates (except year), so in the absence of a “limited dataset,” the date of diagnosis or date of death is not available for calculating overall survival. Thus, we developed a method that is suitable for modeling high-dimensional discrete survival time data and assessed its performance by conducting a simulation study and by predicting the discrete survival times of acute myeloid leukemia patients using a high-dimensional dataset.

    Analysis and Design of Detection for Liver Cancer using Particle Swarm Optimization and Decision Tree

    Liver cancer is a major cause of death worldwide. According to the World Health Organization (WHO), 9.6 million people die of cancer worldwide every year. As reported by the American Cancer Society, it ranks eighth among the leading causes of death in women and fifth in men. The number of deaths due to cancer is projected to increase by 45 percent between 2008 and 2030. The most common cancers are lung, breast, liver, and colorectal. Approximately 782,000 people die of liver cancer each year. The most efficient way to decrease the death rate from liver cancer is to treat the disease at an early stage. Early treatment depends on early diagnosis, which in turn depends on reliable diagnostic methods. CT imaging is one of the most common and important techniques and serves as an imaging tool for evaluating patients with suspected liver cancer. The diagnosis of liver cancer has historically been made manually by a skilled radiologist, who relied on their expertise and personal judgement to reach a conclusion. The main objective of this paper is to develop automatic methods based on a machine learning approach for accurate detection of liver cancer, in order to help radiologists in clinical practice. The paper's primary contribution is to the process of liver cancer lesion classification and automatic detection for clinical diagnosis. For detecting liver cancer lesions, the best-performing approaches based on particle swarm optimization (PSO) and DPSO are presented. Wavelet-based statistical and morphological features were extracted and classified with the C4.5 decision tree classifier.