
    Nonconvex Generalization of ADMM for Nonlinear Equality Constrained Problems

    The ever-increasing demand for efficient and distributed optimization algorithms for large-scale data has led to the growing popularity of the Alternating Direction Method of Multipliers (ADMM). However, although the use of ADMM to solve linear equality constrained problems is well understood, a generic framework for solving problems with nonlinear equality constraints, which are common in practical applications (e.g., spherical constraints), is still lacking. To address this problem, we propose neADMM, a new generic ADMM framework for handling nonlinear equality constraints. After introducing the generalized problem formulation and the neADMM algorithm, we discuss the convergence properties of neADMM, along with its sublinear convergence rate o(1/k), where k is the number of iterations. Next, two important applications of neADMM are considered, and the paper concludes by describing extensive experiments on several synthetic and real-world datasets that demonstrate the convergence and effectiveness of neADMM compared to existing state-of-the-art methods.
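    To make the setting concrete, here is a minimal sketch of classical two-block ADMM on a toy convex problem with a *linear* equality constraint — the well-understood case that neADMM generalizes to nonlinear constraints. The problem, variable names, and penalty parameter are illustrative assumptions, not the paper's formulation.

```python
# Classical ADMM on a toy problem: minimize 0.5*(x-a)^2 + 0.5*(z-b)^2
# subject to the linear equality constraint x - z = 0.
# (neADMM generalizes this setting to nonlinear constraints h(x, z) = 0.)

def admm_toy(a, b, rho=1.0, iters=200):
    x = z = u = 0.0          # u is the scaled dual variable
    for _ in range(iters):
        # x-update: argmin_x 0.5*(x-a)^2 + (rho/2)*(x - z + u)^2
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z-update: argmin_z 0.5*(z-b)^2 + (rho/2)*(x - z + u)^2
        z = (b + rho * (x + u)) / (1.0 + rho)
        # dual update: accumulate the constraint residual x - z
        u = u + (x - z)
    return x, z

x, z = admm_toy(1.0, 3.0)    # optimum is x = z = (a + b)/2 = 2.0
```

    Each subproblem here has a closed-form solution; the appeal of the ADMM family is exactly this splitting into cheap per-block updates coupled only through the dual variable.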

    Human-centred artificial intelligence for mobile health sensing: challenges and opportunities

    Advances in wearable sensing and mobile computing have enabled the collection of health and well-being data outside of traditional laboratory and hospital settings, paving the way for a new era of mobile health. Meanwhile, artificial intelligence (AI) has made significant strides in various domains, demonstrating its potential to revolutionize healthcare. Devices can now diagnose diseases, predict heart irregularities and unlock the full potential of human cognition. However, the application of machine learning (ML) to mobile health sensing poses unique challenges due to noisy sensor measurements, high-dimensional data, sparse and irregular time series, heterogeneity in data, privacy concerns and resource constraints. Despite the recognition of the value of mobile sensing, leveraging these datasets has lagged behind other areas of ML. Furthermore, obtaining quality annotations and ground truth for such data is often expensive or impractical. While recent large-scale longitudinal studies have shown promise in leveraging wearable sensor data for health monitoring and prediction, they also introduce new challenges for data modelling. This paper explores the challenges and opportunities of human-centred AI for mobile health, focusing on key sensing modalities such as audio, location and activity tracking. We discuss the limitations of current approaches and propose potential solutions.

    Twitter Mining for Syndromic Surveillance

    Enormous amounts of personalised data are generated daily from social media platforms today. Twitter in particular generates vast textual streams in real time, accompanied by personal information. This big social media data offers a potential avenue for inferring public and social patterns. This PhD thesis investigates the use of Twitter data to deliver signals for syndromic surveillance, in order to assess its ability to augment existing syndromic surveillance efforts and to give a better understanding of symptomatic people who do not seek healthcare advice directly. We focus on a specific syndrome: asthma/difficulty breathing. We seek to develop means of extracting reliable signals from the Twitter stream, to be used for syndromic surveillance purposes. We begin by outlining our data collection and preprocessing methods. However, we observe that even with keyword-based data collection, many of the collected tweets are not relevant, because they represent chatter or talk of awareness rather than an individual suffering from a particular condition. In light of this, we set out to identify relevant tweets in order to collect a strong and reliable signal. We first develop novel features based on the emoji content of tweets and apply semi-supervised learning techniques to filter tweets. Next, we investigate the effectiveness of deep learning at this task. We propose a novel classification algorithm based on neural language models and compare it to existing successful and popular deep learning algorithms. Following this, we propose an attentive bi-directional Recurrent Neural Network architecture for filtering tweets, which also offers additional syndromic surveillance utility by identifying keywords among syndromic tweets. In doing so, we are not only able to detect alarms, but also gain some clues into what an alarm involves.
Lastly, we look towards optimizing the Twitter syndromic surveillance pipeline by selecting the best possible keywords to supply to the Twitter API. We develop algorithms to intelligently and automatically select keywords such that both the quality, in terms of relevance, and the quantity of the tweets collected are maximised.
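    As a rough illustration of what emoji-based features for tweet filtering might look like, here is a minimal extractor. The feature names and Unicode ranges are illustrative assumptions for this sketch, not the thesis's actual feature set.

```python
# Hypothetical emoji feature extractor for tweet filtering.
# The Unicode ranges below cover common emoji blocks but are an
# illustrative approximation, not an exhaustive emoji definition.
EMOJI_RANGES = [(0x1F300, 0x1FAFF), (0x2600, 0x27BF)]

def is_emoji(ch):
    return any(lo <= ord(ch) <= hi for lo, hi in EMOJI_RANGES)

def emoji_features(tweet):
    emojis = [ch for ch in tweet if is_emoji(ch)]
    return {
        "emoji_count": len(emojis),          # total emoji characters
        "unique_emojis": len(set(emojis)),   # distinct emoji characters
        "has_emoji": bool(emojis),           # any emoji at all
    }

feats = emoji_features("feeling awful, can't breathe \U0001F637\U0001F637")
```

    Simple counts like these can then be concatenated with text features and fed to a downstream (e.g. semi-supervised) classifier that separates genuine symptom reports from chatter.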

    Causally-Inspired Generalizable Deep Learning Methods under Distribution Shifts

    Deep learning methods have achieved remarkable success in various areas of artificial intelligence, due to their powerful distribution-matching capabilities. However, these successes rely heavily on the i.i.d. assumption, i.e., that the data distributions in the training and test datasets are the same. As a result, current deep learning methods typically exhibit poor generalization under distribution shift, performing poorly on test data whose distribution differs from the training data. This significantly hinders the application of deep learning methods to real-world scenarios, as the distribution of test data is not always the same as the training distribution in our rapidly evolving world. This thesis discusses how to construct generalizable deep learning methods under distribution shifts. To achieve this, the thesis first models a prediction task as a structural causal model (SCM), which establishes the relationships between variables using directed acyclic graphs. In an SCM, some variables change easily across domains while others do not. However, deep learning methods often unintentionally mix invariant variables with easily changed ones, causing the learned model to deviate from the true one and resulting in poor generalization ability under distribution shift. To remedy this issue, we propose specific algorithms to model the invariant part of the SCM with deep learning methods, and experimentally show that this helps the trained model generalize well to different distributions of the same task. Lastly, we further propose to identify and model the variant information in the new test distribution so that we can fully adapt the trained deep learning model accordingly. We show that the method can be extended to several practical applications, such as classification under label shift, image translation under semantics shift, robotics control in dynamics generalization, and generalizing large language models to visual question-answering tasks.
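    The core failure mode described above can be simulated with a toy SCM. In this hedged sketch, the label is caused by an invariant feature, while a spurious feature merely correlates with it at a domain-dependent strength; all variable names and probabilities are assumptions for illustration, not the thesis's experimental setup.

```python
import random

# Toy structural causal model: label y is caused by an invariant feature
# x_inv (y = [x_inv > 0]), while a spurious feature s merely correlates
# with y, and the strength of that correlation changes across domains.
def sample_domain(p_spurious, n=2000, seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x_inv = rng.uniform(-1.0, 1.0)
        y = int(x_inv > 0)
        s = y if rng.random() < p_spurious else 1 - y
        data.append((x_inv, s, y))
    return data

def accuracy(predict, data):
    return sum(predict(x, s) == y for x, s, y in data) / len(data)

invariant_clf = lambda x, s: int(x > 0)   # uses the causal mechanism
spurious_clf = lambda x, s: s             # shortcut on the correlate

train = sample_domain(p_spurious=0.9, seed=0)  # training domain
test = sample_domain(p_spurious=0.1, seed=1)   # shifted test domain
```

    A classifier relying on the invariant mechanism keeps its accuracy in the shifted domain, while the shortcut classifier collapses — exactly the gap that modelling the invariant part of the SCM is meant to close.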

    Time-Series Embedded Feature Selection Using Deep Learning: Data Mining Electronic Health Records for Novel Biomarkers

    As health information technologies continue to advance, the routine collection and digitisation of patient health records in the form of electronic health records present an ideal opportunity for data mining and exploratory analysis of biomarkers and risk factors indicative of a potentially diverse domain of patient outcomes. Patient records have continually become more widely available through various initiatives enabling open access whilst maintaining critical patient privacy. In spite of such progress, health records remain not widely adopted within the current clinical statistical analysis domain due to challenging issues derived from such “big data”.
Deep-learning-based temporal modelling approaches present an ideal solution to health record challenges through automated self-optimisation of representation learning, able to manageably compose the high-dimensional domain of patient records into data representations able to model complex data associations. Such representations can serve to condense and reduce dimensionality to emphasise feature sparsity and importance through novel embedded feature selection approaches. Accordingly, application to patient records enables complex modelling and analysis of the full domain of clinical features to select biomarkers of predictive relevance.
Firstly, we propose a novel entropy-regularised neural network ensemble able to highlight risk factors associated with the hospitalisation risk of individuals with dementia. Its application was able to reduce a large domain of unique medical events to a small set of relevant risk factors able to maintain hospitalisation discrimination.
Following on, we continue our work on ensemble architectures with novel cascading LSTM ensembles to predict severe sepsis onset in critical patients in an ICU critical care centre. We demonstrate state-of-the-art performance capabilities able to outperform that of current related literature.
Finally, we propose a novel embedded feature selection application dubbed 1D convolution feature selection using sparsity regularisation. This methodology was evaluated on both the dementia and sepsis prediction objectives to highlight model capability and generalisability. We further report a selection of potential biomarkers for the aforementioned case study objectives, highlighting clinical relevance and potential novelty value for future clinical analysis.
Accordingly, we demonstrate the effective capability of embedded feature selection approaches through the application of temporal deep learning architectures in the discovery of effective biomarkers across a variety of challenging clinical applications.
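    The sparsity-regularised embedded selection idea can be illustrated with a plain L1-penalised linear model fitted by proximal gradient descent (ISTA). This is a deliberately simplified stand-in for the thesis's 1D-convolution approach, on made-up synthetic data; the regularisation strength and threshold are assumptions of the sketch.

```python
import numpy as np

def soft_threshold(w, t):
    # Proximal operator of the L1 norm: shrinks weights toward zero.
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def ista_lasso(X, y, lam, iters=500):
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1/L for the smooth part
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
w_true = np.zeros(10)
w_true[0], w_true[3] = 2.0, -1.5        # only two informative features
y = X @ w_true

w = ista_lasso(X, y, lam=5.0)
selected = set(np.flatnonzero(np.abs(w) > 0.1))   # embedded selection
```

    The L1 penalty drives the weights of uninformative features to (near) zero during training itself, so feature selection falls out of the fitted model rather than being a separate filtering step — the same principle the thesis applies to 1D convolution weights over temporal clinical data.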

    What does explainable AI explain?

    Machine Learning (ML) models are increasingly used in industry, as well as in scientific research and social contexts. Unfortunately, ML models provide only partial solutions to real-world problems, focusing on predictive performance in static environments. Problem aspects beyond prediction, such as robustness in employment, knowledge generation in science, or providing recourse recommendations to end-users, cannot be directly tackled with ML models. Explainable Artificial Intelligence (XAI) aims to solve, or at least highlight, problem aspects beyond predictive performance through explanations. However, the field is still in its infancy, as fundamental questions such as “What are explanations?”, “What constitutes a good explanation?”, or “How do explanation and understanding relate?” remain open. In this dissertation, I combine philosophical conceptual analysis and mathematical formalization to clarify a prerequisite of these difficult questions, namely what XAI explains: I point out that XAI explanations are either associative or causal, and either aim to explain the ML model or the modeled phenomenon. The thesis is a collection of five individual research papers that all aim to clarify how different problems in XAI are related to these different “whats”. In Paper I, my co-authors and I illustrate how to construct XAI methods for inferring associational phenomenon relationships. Paper II directly relates to the first; we formally show how to quantify the uncertainty of such scientific inferences for two XAI methods – partial dependence plots (PDP) and permutation feature importance (PFI). Paper III discusses the relationship between counterfactual explanations and adversarial examples; I argue that adversarial examples can be described as counterfactual explanations that alter the prediction but not the underlying target variable.
In Paper IV, my co-authors and I argue that algorithmic recourse recommendations should help data-subjects improve their qualification rather than game the predictor. In Paper V, we address general problems with model-agnostic XAI methods and identify possible solutions.
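    Permutation feature importance, one of the two XAI methods discussed above, can be sketched in a few lines: a feature's importance is the increase in prediction error after its column is shuffled, which breaks the feature's association with the target. The toy model and data here are illustrative assumptions, not the dissertation's experiments.

```python
import numpy as np

def permutation_importance(model, X, y, seed=0):
    # Importance of feature j = increase in MSE after shuffling column j.
    rng = np.random.default_rng(seed)
    base_mse = np.mean((model(X) - y) ** 2)
    importances = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        importances.append(np.mean((model(Xp) - y) ** 2) - base_mse)
    return importances

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
y = 2.0 * X[:, 0]                 # only feature 0 matters
model = lambda X: 2.0 * X[:, 0]   # a model that has learned exactly that

imp = permutation_importance(model, X, y)
```

    Permuting the irrelevant feature leaves the predictions untouched, so its importance is exactly zero, while permuting the relevant one inflates the error substantially; quantifying the sampling uncertainty of such estimates is the subject of Paper II.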

    Extreme multi-label deep neural classification of Spanish health records according to the International Classification of Diseases

    This work concerns clinical text mining, a field of Natural Language Processing applied to the biomedical domain. The goal is to automate the task of medical coding. Electronic health records (EHRs) are documents containing clinical information about a patient's health. The diagnoses and medical procedures captured in the electronic health record are coded with respect to the International Classification of Diseases (ICD). Indeed, the ICD is the basis for compiling international health statistics and the standard for reporting diseases and health conditions. From a machine learning perspective, the goal is to solve an extreme multi-label text classification problem, since each health record is assigned multiple ICD codes from a set of more than 70,000 diagnostic terms. A substantial amount of resources is devoted to medical coding, a laborious task currently performed manually. EHRs are lengthy narratives, and medical coders review the records written by physicians and assign the corresponding ICD codes. The texts are technical, as physicians employ specialised medical jargon that is nonetheless rich in abbreviations, acronyms and spelling errors, since physicians document the records while carrying out actual clinical practice. To address the automatic classification of health records, we investigate and develop a set of deep learning text classification techniques.
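    The multi-label framing above amounts to encoding each record's set of ICD codes as a binary indicator vector over the code vocabulary. A minimal sketch, with a tiny illustrative vocabulary standing in for the 70,000+ real codes:

```python
# Multi-label setup: each health record is assigned a *set* of ICD codes,
# encoded as a binary indicator vector over the code vocabulary.
# The codes and records below are illustrative, not real clinical data.
code_vocab = ["I10", "E11.9", "J45.909", "N39.0"]  # stand-in for 70k+ codes
code_index = {c: i for i, c in enumerate(code_vocab)}

def binarize(record_codes):
    y = [0] * len(code_vocab)
    for code in record_codes:
        y[code_index[code]] = 1
    return y

# One record can carry several diagnoses at once:
y = binarize({"I10", "J45.909"})
```

    With 70,000+ labels the indicator vectors become extremely sparse and high-dimensional, which is what makes this an *extreme* multi-label problem rather than ordinary multi-class classification.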

    Social media mental health analysis framework through applied computational approaches

    Studies have shown that mental illness burdens not only public health and productivity but also established market economies throughout the world. However, mental disorders are difficult to diagnose and monitor through traditional methods, which heavily rely on interviews, questionnaires and surveys, resulting in high under-diagnosis and under-treatment rates. The increasing use of online social media, such as Facebook and Twitter, is now a common part of people’s everyday life. The continuous and real-time user-generated content often reflects feelings, opinions, social status and behaviours of individuals, creating an unprecedented wealth of person-specific information. With advances in data science, social media has already been increasingly employed in population health monitoring and more recently mental health applications to understand mental disorders as well as to develop online screening and intervention tools. However, existing research efforts are still in their infancy, primarily aimed at highlighting the potential of employing social media in mental health research. The majority of work is developed on ad hoc datasets and lacks a systematic research pipeline. [Continues.]