
    Time-Series Embedded Feature Selection Using Deep Learning: Data Mining Electronic Health Records for Novel Biomarkers

    As health information technologies continue to advance, the routine collection and digitisation of patient health records in the form of electronic health records presents an ideal opportunity for data mining and exploratory analysis of biomarkers and risk factors indicative of a potentially diverse domain of patient outcomes. Patient records have become increasingly available through various initiatives enabling open access whilst maintaining critical patient privacy. In spite of such progress, health records remain not widely adopted within current clinical statistical analysis due to the challenging issues posed by such “big data”. Deep learning based temporal modelling approaches present an ideal solution to these challenges: through automated self-optimisation of representation learning, they can manageably compose the high-dimensional domain of patient records into representations able to model complex data associations. Such representations can condense and reduce dimensionality to emphasise feature sparsity and importance through novel embedded feature selection approaches. Accordingly, their application to patient records enables complex modelling and analysis of the full domain of clinical features to select biomarkers of predictive relevance. Firstly, we propose a novel entropy-regularised neural network ensemble able to highlight risk factors associated with hospitalisation risk in individuals with dementia. Its application reduced a large domain of unique medical events to a small set of relevant risk factors that maintain hospitalisation discrimination. Following on, we continue our work on ensemble architectures with a novel cascading LSTM ensemble to predict severe sepsis onset in critical patients in an ICU critical care centre. We demonstrate state-of-the-art performance, outperforming that of the current related literature. Finally, we propose a novel embedded feature selection method dubbed 1D convolution feature selection using sparsity regularisation. This methodology was evaluated on both the dementia and sepsis prediction objectives to highlight model capability and generalisability. We further report a selection of potential biomarkers for the aforementioned case study objectives, highlighting clinical relevance and potential novelty value for future clinical analysis. Accordingly, we demonstrate the effective capability of embedded feature selection approaches through the application of temporal deep learning architectures in the discovery of effective biomarkers across a variety of challenging clinical applications.
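    A minimal sketch of what “1D convolution feature selection using sparsity regularisation” could look like in practice, assuming a depthwise temporal convolution per clinical feature whose filters carry an L1 penalty so that unimportant features shrink towards zero; the class name, the penalty weight (1e-3), and the toy tensors are illustrative assumptions, not the thesis implementation.

```python
# Sketch only: per-feature 1D convolution with an L1 (sparsity) penalty so that
# features whose filter weights collapse to ~0 can be discarded.
import torch
import torch.nn as nn

class Conv1DFeatureSelector(nn.Module):
    def __init__(self, n_features: int, kernel_size: int = 3):
        super().__init__()
        # Depthwise conv: one filter per clinical feature (groups=n_features),
        # so each feature keeps its own learnable temporal weights.
        self.select = nn.Conv1d(n_features, n_features, kernel_size,
                                padding=kernel_size // 2, groups=n_features,
                                bias=False)
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(n_features, 1))

    def forward(self, x):                 # x: (batch, n_features, time)
        return self.classifier(self.select(x))

    def sparsity_penalty(self):
        # L1 norm of the per-feature filters drives irrelevant features to zero.
        return self.select.weight.abs().sum()

    def feature_importance(self):
        # Aggregate filter magnitude per feature as an importance score.
        return self.select.weight.detach().abs().sum(dim=(1, 2))

# Hypothetical training step: the 1e-3 weight trades prediction loss against sparsity.
model = Conv1DFeatureSelector(n_features=32)
x, y = torch.randn(8, 32, 48), torch.randint(0, 2, (8, 1)).float()
loss = nn.functional.binary_cross_entropy_with_logits(model(x), y) \
       + 1e-3 * model.sparsity_penalty()
loss.backward()
```

    After training, features whose importance scores remain near zero are candidates to drop, mirroring the embedded selection described above.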

    VIVA: An Online Algorithm for Piecewise Curve Estimation Using ℓ0 Norm Regularization

    Many processes deal with piecewise input functions, which occur naturally as a result of digital commands, user interfaces requiring a confirmation action, or discrete-time sampling. Examples include the assembly of protein polymers and hourly adjustments to the infusion rate of IV fluids during treatment of burn victims. Estimation of the input is straightforward regression when the observer has access to the timing information. More work is needed if the input can change at unknown times. Successful recovery of the change timing is largely dependent on the choice of cost function minimized during parameter estimation. Optimal estimation of a piecewise input will often proceed by minimization of a cost function which includes an estimation error term (most commonly mean square error) and the number (cardinality) of input changes (number of commands). Because the cardinality (ℓ0 norm) is not convex, the ℓ2 norm (quadratic smoothing) and ℓ1 norm (total variation minimization) are often substituted because they permit the use of convex optimization algorithms. However, these penalize the magnitude of input changes and therefore bias the piecewise estimates. Another disadvantage is that global optimization methods must be run after the end of data collection. One approach to unbiasing the piecewise parameter fits would include application of total variation minimization to recover timing, followed by piecewise parameter fitting. Another method is presented herein: a dynamic programming approach which iteratively develops populations of candidate estimates of increasing length, pruning those proven to be dominated. Because the usage of input data is entirely causal, the algorithm recovers timing and parameter values online. A functional definition of the algorithm, which is an extension of Viterbi decoding and integrates the pruning concept from branch-and-bound, is presented. Modifications are introduced to improve handling of non-uniform sampling, non-uniform confidence, and burst errors. Performance tests using synthesized data sets as well as volume data from a research system recording fluid infusions show five-fold (piecewise-constant data) and 20-fold (piecewise-linear data) reduction in error compared to total variation minimization, along with improved sparsity and reduced sensitivity to the regularization parameter. Algorithmic complexity and delay are also considered.
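    The ℓ0-penalised objective described above can be made concrete with the standard offline dynamic program for piecewise-constant fitting (optimal partitioning). This is not VIVA itself, which operates online with Viterbi-style candidate pruning; the function name, the regularization weight `lam`, and the synthetic signal below are assumptions for illustration.

```python
# Sketch only: minimise sum of squared errors + lam * (number of constant
# segments) by exact dynamic programming over all segmentations.
import numpy as np

def l0_piecewise_constant(y, lam):
    n = len(y)
    s1 = np.concatenate(([0.0], np.cumsum(y)))       # prefix sums
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))  # prefix sums of squares

    def seg_cost(i, j):
        # Squared error of fitting a single mean to y[i:j].
        m = (s1[j] - s1[i]) / (j - i)
        return (s2[j] - s2[i]) - (j - i) * m * m

    best = np.full(n + 1, np.inf)      # best[j]: optimal cost of y[:j]
    best[0] = -lam                     # so a fit with k segments pays k * lam
    prev = np.zeros(n + 1, dtype=int)  # prev[j]: start index of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            c = best[i] + seg_cost(i, j) + lam
            if c < best[j]:
                best[j], prev[j] = c, i

    # Backtrack the right edges of each constant segment.
    edges, j = [], n
    while j > 0:
        edges.append(j)
        j = prev[j]
    return sorted(edges)

# Toy signal: a single jump from 1.0 to 3.0 at index 50 plus mild noise.
rng = np.random.default_rng(0)
y = np.concatenate([np.full(50, 1.0), np.full(50, 3.0)]) + 0.1 * rng.normal(size=100)
print(l0_piecewise_constant(y, lam=1.0))   # expect segment edges near [50, 100]
```

    Because the penalty counts changes rather than their magnitudes, the recovered levels avoid the shrinkage bias that the abstract attributes to ℓ1 (total variation) substitutes.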

    A Review of Bias and Fairness in Artificial Intelligence

    Automating decision systems has led to hidden biases in the use of artificial intelligence (AI). Consequently, explaining these decisions and identifying responsibilities has become a challenge. As a result, a new field of research on algorithmic fairness has emerged. In this area, detecting biases and mitigating them is essential to ensure fair and discrimination-free decisions. This paper contributes: (1) a categorization of biases and how these are associated with the different phases of an AI model’s development (including the data-generation phase); (2) a revision of fairness metrics to audit the data and the AI models trained with them (considering agnostic models when focusing on fairness); and (3) a novel taxonomy of the procedures to mitigate biases in the different phases of an AI model’s development (pre-processing, training, and post-processing), with the addition of transversal actions that help to produce fairer models.
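    As a rough illustration of the kind of group-fairness metrics such a review audits, the sketch below computes demographic parity and equal opportunity gaps for binary predictions and a binary protected attribute; the function names and synthetic data are not taken from the paper.

```python
# Sketch only: two common group-fairness metrics on synthetic binary data.
import numpy as np

def demographic_parity_diff(y_pred, group):
    # Gap in positive-prediction rate between group==1 and group==0.
    return abs(y_pred[group == 1].mean() - y_pred[group == 0].mean())

def equal_opportunity_diff(y_true, y_pred, group):
    # Gap in true-positive rate (predictions among actual positives) between groups.
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return abs(tpr(1) - tpr(0))

# Toy audit with random predictions and a random protected attribute.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
group = rng.integers(0, 2, 1000)
y_pred = rng.integers(0, 2, 1000)
print(demographic_parity_diff(y_pred, group),
      equal_opportunity_diff(y_true, y_pred, group))
```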

    A Differentially Private Weighted Empirical Risk Minimization Procedure and its Application to Outcome Weighted Learning

    It is commonplace to use data containing personal information to build predictive models in the framework of empirical risk minimization (ERM). While these models can be highly accurate in prediction, results obtained from these models with the use of sensitive data may be susceptible to privacy attacks. Differential privacy (DP) is an appealing framework for addressing such data privacy issues by providing mathematically provable bounds on the privacy loss incurred when releasing information from sensitive data. Previous work has primarily concentrated on applying DP to unweighted ERM. We consider an important generalization to weighted ERM (wERM). In wERM, each individual's contribution to the objective function can be assigned varying weights. In this context, we propose the first differentially private wERM algorithm, backed by a rigorous theoretical proof of its DP guarantees under mild regularity conditions. Extending the existing DP-ERM procedures to wERM paves a path to deriving privacy-preserving learning methods for individualized treatment rules, including the popular outcome weighted learning (OWL). We evaluate the performance of the DP-wERM application to OWL in a simulation study and in a real clinical trial of melatonin for sleep health. All empirical results demonstrate the viability of training OWL models via wERM with DP guarantees while maintaining sufficiently useful model performance. Therefore, we recommend practitioners consider implementing the proposed privacy-preserving OWL procedure in real-world scenarios involving sensitive data.
    Comment: 24 pages and 2 figures for the main manuscript, 5 pages and 2 figures for the supplementary material.
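    A minimal sketch of one way a weighted ERM fit could be privatised, here by output perturbation of an L2-regularised weighted logistic regression: the learned parameters are released with added Gaussian noise. The noise scale `sigma` is purely illustrative (the paper derives the proper calibration under its regularity conditions), and the helper name and toy weights are assumptions.

```python
# Sketch only: weighted ERM step followed by output perturbation of the
# fitted parameters; sigma is NOT a calibrated DP noise scale.
import numpy as np
from sklearn.linear_model import LogisticRegression

def dp_weighted_logreg(X, y, sample_weight, sigma, C=1.0, seed=0):
    clf = LogisticRegression(C=C, max_iter=1000)
    clf.fit(X, y, sample_weight=sample_weight)           # weighted ERM step
    theta = np.concatenate([clf.coef_.ravel(), clf.intercept_])
    rng = np.random.default_rng(seed)
    return theta + rng.normal(0.0, sigma, size=theta.shape)  # noisy release

# Toy usage with hypothetical per-individual weights (e.g. outcome-derived
# weights as in outcome weighted learning).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)
w = rng.uniform(0.5, 2.0, size=200)
print(dp_weighted_logreg(X, y, w, sigma=0.1))
```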