8 research outputs found

    Analyzing Patient Trajectories With Artificial Intelligence

    In digital medicine, patient data typically record health events over time (e.g., through electronic health records, wearables, or other sensing technologies) and thus form unique patient trajectories. Patient trajectories are highly predictive of the future course of diseases and therefore facilitate effective care. However, digital medicine often uses only limited patient data, consisting of health events from a single time point or a small number of time points, while ignoring the additional information encoded in patient trajectories. To analyze such rich longitudinal data, new artificial intelligence (AI) solutions are needed. In this paper, we provide an overview of recent efforts to develop trajectory-aware AI solutions and offer suggestions for future directions. Specifically, we examine the implications of developing disease models from patient trajectories along the typical AI workflow: problem definition, data processing, modeling, evaluation, and interpretation. We conclude with a discussion of how such AI solutions will allow the field to build robust models for personalized risk scoring, subtyping, and disease pathway discovery.
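
    The paper is a survey, so the following is only a minimal sketch of the kind of trajectory-aware model it discusses: a recurrent network that encodes a patient's sequence of health events into a risk score. All names, dimensions, and the PyTorch framing are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (illustrative, not from the paper): encode a longitudinal
# patient trajectory with an LSTM and output a personalized risk score.
import torch
import torch.nn as nn

class TrajectoryRiskModel(nn.Module):
    def __init__(self, n_event_codes: int, emb_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(n_event_codes, emb_dim)  # one vector per event code
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)                   # risk logit

    def forward(self, events: torch.Tensor) -> torch.Tensor:
        # events: (batch, time) integer codes of health events over time
        x = self.embed(events)
        _, (h, _) = self.rnn(x)  # final hidden state summarizes the trajectory
        return torch.sigmoid(self.head(h[-1])).squeeze(-1)

model = TrajectoryRiskModel(n_event_codes=500)
risk = model(torch.randint(0, 500, (8, 20)))  # 8 patients, 20 events each; shape (8,)
```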

    Individualized survival prediction and surgery recommendation for patients with glioblastoma

    Background: There is a lack of individualized evidence on surgical choices for glioblastoma (GBM) patients.
    Aim: This study aimed to make individualized treatment recommendations for patients with GBM and to determine the importance of demographic and tumor-characteristic variables in the selection of the extent of resection.
    Methods: We proposed Balanced Decision Ensembles (BDE) to make survival predictions and individualized treatment recommendations. We developed several deep learning (DL) models to counterfactually predict the individual treatment effect (ITE) of patients with GBM. We divided the patients into the recommended (Rec.) and anti-recommended groups based on whether their actual treatment was consistent with the model recommendation.
    Results: The BDE achieved the best recommendation effects (difference in restricted mean survival time (dRMST): 5.90; 95% confidence interval (CI), 4.40–7.39; hazard ratio (HR): 0.71; 95% CI, 0.65–0.77), followed by BITES and DeepSurv. The inverse probability of treatment weighting (IPTW)-adjusted HR, the IPTW-adjusted odds ratio (OR), the natural direct effect, and the controlled direct effect all demonstrated better survival outcomes in the Rec. group.
    Conclusion: The ITE calculation method is crucial, as it may result in better or worse recommendations. Furthermore, the significant protective effects of machine recommendations on survival time and mortality indicate the model's suitability for application in patients with GBM. Overall, the model identifies patients with tumors located in the right and left frontal and middle temporal lobes, as well as those with larger tumor sizes, as optimal candidates for SpTR.
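
    As a hedged illustration of the counterfactual recommendation logic described above (not the paper's BDE implementation), the sketch below predicts survival under each treatment option, computes the individual treatment effect, and assigns patients to the recommended or anti-recommended group; `predict_survival` is a hypothetical stand-in for any fitted counterfactual survival model.

```python
# Sketch of ITE-based treatment recommendation (assumed interface, not the
# paper's code). `predict_survival` is hypothetical: any model estimating
# expected survival time under a given treatment would fit here.
import numpy as np

def recommend(features: np.ndarray, predict_survival) -> np.ndarray:
    """Recommend treatment 1 when its predicted survival exceeds treatment 0's."""
    s_treated = predict_survival(features, treatment=1)  # counterfactual: treated
    s_control = predict_survival(features, treatment=0)  # counterfactual: untreated
    ite = s_treated - s_control                          # individual treatment effect
    return (ite > 0).astype(int)

def split_groups(actual: np.ndarray, recommended: np.ndarray):
    """Rec. group: actual treatment matches the recommendation; anti-Rec.: it does not."""
    rec = actual == recommended
    return rec, ~rec

# toy usage with a dummy counterfactual predictor
dummy = lambda X, treatment: X.sum(axis=1) + 3.0 * treatment
X = np.random.rand(5, 4)
print(recommend(X, dummy))  # treatment always helps under this dummy model
```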

    LocalGLMnet: interpretable deep learning for tabular data

    Deep learning models have gained great popularity in statistical modeling because they lead to very competitive regression models, often outperforming classical statistical models such as generalized linear models. Their disadvantage is that their solutions are difficult to interpret and explain, and variable selection is not easily possible, because deep learning models handle feature engineering and variable selection internally in a nontransparent way. Inspired by the appealing structure of generalized linear models, we propose a new network architecture that shares similar features with generalized linear models but provides superior predictive power by benefiting from the art of representation learning. This new architecture allows for variable selection on tabular data and for interpretation of the calibrated deep learning model; in fact, our approach provides an additive decomposition that can be related to other model interpretability techniques.
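
    The additive decomposition mentioned above can be made concrete with a rough PyTorch sketch of the LocalGLMnet idea (the authors' own implementation and hyperparameters may differ): a network maps the input x to feature-specific regression attentions beta(x), and the linear predictor is the GLM-style inner product beta0 + <beta(x), x>, so each feature's contribution beta_j(x) * x_j can be read off directly, and features with beta_j(x) near zero are candidates for removal.

```python
# Rough sketch of a LocalGLMnet-style architecture (illustrative; the paper's
# exact network may differ). The model is a GLM whose coefficients beta(x)
# are themselves produced by a neural network.
import torch
import torch.nn as nn

class LocalGLMnet(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.attention = nn.Sequential(            # produces beta(x)
            nn.Linear(n_features, hidden), nn.Tanh(),
            nn.Linear(hidden, n_features),
        )
        self.beta0 = nn.Parameter(torch.zeros(1))  # GLM intercept

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        beta = self.attention(x)                    # (batch, n_features)
        return self.beta0 + (beta * x).sum(dim=-1)  # linear predictor; apply the GLM link outside

model = LocalGLMnet(n_features=10)
eta = model(torch.randn(4, 10))  # linear predictor for 4 samples
```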

    Industry return prediction via interpretable deep learning

    We apply an interpretable machine learning model, the LassoNet, to forecast and trade U.S. industry portfolio returns. The model combines a regularization mechanism with a neural network architecture. A cooperative game-theoretic algorithm is also applied to interpret our findings; it ranks the covariates by their contribution to overall model performance. Our findings reveal that the LassoNet outperforms various linear and nonlinear benchmarks in out-of-sample forecasting accuracy and provides economically meaningful and profitable predictions. Valuation ratios are the most important covariates, followed by individual and cross-industry lagged returns. The constructed industry ETF portfolios attain positive Sharpe ratios and positive, statistically significant alphas, even after transaction costs.
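
    For intuition, here is a deliberately simplified LassoNet-style model (the published method additionally enforces a hierarchy constraint between the skip weights and the first network layer via a proximal training step, which is omitted here): an L1-penalized linear skip connection drives feature sparsity, while an MLP captures nonlinear structure.

```python
# Simplified LassoNet-style sketch (illustrative; omits the hierarchy
# constraint and proximal training step of the published method).
import torch
import torch.nn as nn

class LassoNetSketch(nn.Module):
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.skip = nn.Linear(n_features, 1, bias=False)  # sparse linear path
        self.mlp = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.skip(x) + self.mlp(x)

def penalized_loss(model, x, y, lam: float = 0.1):
    """MSE plus a lasso penalty on the skip weights, which zeroes out features."""
    mse = nn.functional.mse_loss(model(x).squeeze(-1), y)
    return mse + lam * model.skip.weight.abs().sum()
```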

    Advancing Efficiency in Neural Networks through Sparsity and Feature Selection

    Deep neural networks (DNNs) have attracted considerable attention over the last several years due to their promising results in various applications. Nevertheless, their extensive model size and over-parameterization have brought to the forefront a significant challenge: escalating computational costs. Furthermore, these challenges are exacerbated when dealing with high-dimensional data, as the complexity and resource requirements of DNNs increase significantly. Consequently, deep learning models are ill-suited for scenarios with constrained computational resources and limited battery life, incurring substantial training and inference costs in both memory and computation.

    Sparse neural networks (SNNs) have emerged as a prominent approach to addressing the over-parameterization inherent in DNNs, thus mitigating the associated costs. By keeping only the most important connections of a DNN, they achieve results comparable to their dense counterparts with significantly fewer parameters. However, most current solutions that reduce computation costs using SNNs mainly gain inference efficiency while remaining resource-intensive during training. Furthermore, these solutions predominantly target a restricted set of application domains, particularly vision and language tasks.

    This Ph.D. research addresses these challenges by introducing Cost-effective Artificial Neural Networks (CeANNs), designed to achieve a targeted performance across diverse complex machine learning tasks while demanding minimal computational, memory, and energy resources during both network training and inference. Our study of CeANNs takes two primary perspectives: model and data efficiency. In essence, we leverage the potential of SNNs to reduce model parameters and data dimensionality, thereby facilitating efficient training and deployment of artificial neural networks. This work results in artificial neural networks that are more practical and accessible for real-world applications, with a key emphasis on cost-effectiveness. Within this thesis, we present our methodologies for advancing efficiency. Our contributions can be summarized as follows:

    Part I. Advancing Training and Inference Efficiency of DNNs through Sparsity. This part of the thesis focuses on enhancing the model efficiency of DNNs through sparsity. The inherent high computational cost of DNNs, stemming primarily from their large, over-parameterized layers, highlights the need for computationally aware design in both model architecture and training methods. We leverage sparsity to address this challenge, with a specific focus on achieving a targeted performance in extremely sparse neural networks and on efficient time series analysis with DNNs. We propose two algorithms: a dynamic sparse training (DST) algorithm for learning in extremely sparse neural networks (Chapter 2) and a methodology for obtaining SNNs for time series prediction (Chapter 3); a simplified prune-and-regrow step in the spirit of DST is sketched below. In essence, our goal is to enhance the training and inference efficiency of DNNs through sparsity while addressing specific challenges in underexplored application domains, particularly tabular and time series data analysis.
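
    The sketch below illustrates one prune-and-regrow step in the spirit of dynamic sparse training (e.g., SET-style magnitude pruning with random regrowth); the thesis's actual algorithms in Chapters 2 and 3 may differ in their scheduling and regrowth criteria.

```python
# Illustrative DST step: drop the weakest active connections, regrow the same
# number at random inactive positions (an assumption in the spirit of SET,
# not the thesis's exact algorithm).
import torch

def prune_and_regrow(weight: torch.Tensor, mask: torch.Tensor, zeta: float = 0.3):
    n_drop = int(zeta * int(mask.sum()))
    if n_drop == 0:
        return mask
    # prune: zero out the n_drop smallest-magnitude active weights (ties may drop a few extra)
    threshold = weight[mask.bool()].abs().sort().values[n_drop - 1]
    mask = mask * (weight.abs() > threshold).float()
    # regrow: activate n_drop random inactive positions; regrown weights start at zero
    inactive = (mask == 0).nonzero(as_tuple=False)
    pick = inactive[torch.randperm(inactive.size(0))[:n_drop]]
    mask[pick[:, 0], pick[:, 1]] = 1.0
    weight.data.mul_(mask)  # keep pruned connections at exactly zero
    return mask

w = torch.randn(16, 8)
m = (torch.rand(16, 8) < 0.2).float()  # start ~20% dense
w.mul_(m)
m = prune_and_regrow(w, m)
```
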
    Part II. Leveraging Feature Selection for Efficient Model Development. In the pursuit of cost-effective artificial neural networks, it is crucial to address the challenges associated with high-dimensional input data, which can hinder scalability and introduce issues such as the curse of dimensionality and overfitting. One promising avenue is feature selection, a technique designed to identify the most relevant and informative attributes of a dataset. However, existing feature selection methods are mostly computationally expensive, especially when dealing with high-dimensional datasets or those with a substantial sample size. To address this issue, in the second part of the thesis we propose, for the first time, to exploit SNNs to perform efficient feature selection. We present two feature selection methods, one unsupervised (Chapter 4) and one supervised (Chapter 5); a minimal illustration follows this abstract. These methods are specifically designed to offer effective solutions to the challenges of high dimensionality while maintaining computational efficiency. As we show in Chapter 5, using less than 10% of the parameters of the dense network, our proposed method achieves the highest ranking-based score for finding qualitative features among state-of-the-art feature selection methods. The combination of feature selection and neural networks offers a powerful strategy, enhancing the training process and performing dimensionality reduction, thereby advancing the overall efficiency of model development.

    In conclusion, this research focuses on the development of cost-effective artificial neural networks that deliver targeted performance while minimizing computational, memory, and energy resources. The research explores CeANNs from two perspectives: model efficiency and data efficiency. The first part of the thesis addresses model efficiency through sparsity, proposing algorithms for efficient training and inference of DNNs for various data types. The second part leverages SNNs to efficiently select an informative subset of attributes from high-dimensional input data. By considering both model and data efficiency, the aim is to develop CeANNs that are practical and accessible for real-world applications. In Chapter 6, we present the preliminary impact and limitations of this work and potential directions for future research. We hope that this Ph.D. thesis will pave the way toward designing cost-effective artificial neural networks.
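
    As a minimal illustration of the Part II idea (an assumed simplification, not the methods of Chapters 4 and 5), input features can be ranked by the total magnitude of the connections that survive in a trained sparse layer, and the strongest ones kept.

```python
# Hedged sketch: rank input features by the strength of their surviving
# connections in a sparse layer, then keep the top k.
import torch

def select_features(weight: torch.Tensor, mask: torch.Tensor, k: int) -> torch.Tensor:
    strength = (weight.abs() * mask).sum(dim=0)  # importance of each input neuron
    return torch.topk(strength, k).indices       # indices of the k strongest features

layer_w = torch.randn(64, 1000)                  # (out_features, in_features)
layer_m = (torch.rand(64, 1000) < 0.05).float()  # ~5% dense sparse mask
top20 = select_features(layer_w, layer_m, k=20)
```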