5 research outputs found

    Approximate Data Mining Techniques on Clinical Data

    The past two decades have witnessed an explosion in the number of medical and healthcare datasets available to researchers and healthcare professionals. Data collection efforts are in high demand, which prompts the development of appropriate data mining techniques and tools that can automatically extract relevant information from data. These tools, in turn, provide insights into the clinical behaviors or processes captured by the data. Since they should support the decision-making activities of medical experts, all the extracted information must be represented in a human-friendly way, that is, in a concise and easy-to-understand form. To this end, we propose a new framework that collects several newly proposed mining techniques and tools. These techniques focus mainly on two aspects: the temporal one and the predictive one. All of them were applied to clinical data and, in particular, to ICU data from the MIMIC III database. This application showed the flexibility of the framework, which is able to retrieve different outcomes from the overall dataset. The first two techniques rely on the concept of Approximate Temporal Functional Dependencies (ATFDs). ATFDs have been proposed, with their suitable treatment of temporal information, as a methodological tool for mining clinical data. An example of the knowledge derivable through dependencies is "within 15 days, patients with the same diagnosis and the same therapy usually receive the same daily amount of drug". However, current ATFD models do not analyze the temporal evolution of the data, as in "For most patients with the same diagnosis, the same drug is prescribed after the same symptom". To this end, we propose a new kind of ATFD called Approximate Pure Temporally Evolving Functional Dependencies (APEFDs). Another limitation of this kind of dependency is that it cannot deal with quantitative data when some tolerance can be allowed for numerical values.
In particular, this limitation arises in clinical data warehouses, where analysis and mining have to consider one or more measures related to quantitative data (such as lab test results and vital signs), with respect to multiple dimensional (alphanumeric) attributes (such as patient, hospital, physician, diagnosis) and some time dimensions (such as the day since hospitalization and the calendar date). For this scenario, we introduce a new kind of ATFD, named Multi-Approximate Temporal Functional Dependency (MATFD), which considers dependencies between dimensions and quantitative measures from temporal clinical data. These new dependencies may provide new knowledge such as "within 15 days, patients with the same diagnosis and the same therapy receive a daily amount of drug within a fixed range". The other techniques are based on pattern mining, which has also been proposed as a methodological tool for mining clinical data. However, many methods proposed so far focus on mining temporal rules that describe relationships between data sequences or instantaneous events, without considering the presence of more complex temporal patterns in the dataset. These patterns, such as trends of a particular vital sign, are often very relevant for clinicians. Moreover, it is valuable to discover whether some event, such as a drug administration, is capable of changing these trends, and how. To this end, we propose a new kind of temporal pattern, called Trend-Event Patterns (TEPs), that focuses on events and their influence on trends that can be retrieved from some measures, such as vital signs. With TEPs we can express concepts such as "The administration of paracetamol on a patient with an increasing temperature leads to a decreasing trend in temperature after such administration occurs". We also decided to analyze another interesting pattern mining technique that includes prediction.
This technique discovers a compact set of patterns that aim to describe the condition (or class) of interest. Our framework relies on a classification model that considers and combines various predictive pattern candidates and selects only those that are important for improving the overall class prediction performance. We show that our classification approach achieves a significant reduction in the number of extracted patterns, compared to state-of-the-art methods based on the minimum predictive pattern mining approach, while preserving the overall classification accuracy of the model. For each technique described above, we developed a tool to retrieve its kind of rule. All the results are obtained by pre-processing and mining clinical data and, as mentioned before, in particular ICU data from the MIMIC III database.
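
The Trend-Event Pattern idea in the abstract above (an event changing the trend of a measure) can be illustrated with a minimal sketch. This is not the authors' tool or definition: the function names, the first-to-last trend rule, the `eps` threshold, and the sample temperatures are all assumptions made for illustration.

```python
def trend(values, eps=0.2):
    """Classify a series by its last-minus-first difference (a deliberately crude rule)."""
    if len(values) < 2:
        return "steady"
    delta = values[-1] - values[0]
    if delta > eps:
        return "increasing"
    if delta < -eps:
        return "decreasing"
    return "steady"


def trend_around_event(series, event_time, horizon):
    """Return (trend before, trend after) an event within `horizon` time units.

    `series` is a list of (time, value) pairs sorted by time.
    Readings at the event time itself count as "before".
    """
    before = [v for t, v in series if event_time - horizon <= t <= event_time]
    after = [v for t, v in series if event_time < t <= event_time + horizon]
    return trend(before), trend(after)


# Hypothetical hourly temperatures; a drug is assumed to be given at hour 2.
temperature = [(0, 37.5), (1, 38.0), (2, 38.6), (3, 38.2), (4, 37.6), (5, 37.1)]
pattern = trend_around_event(temperature, event_time=2, horizon=3)
print(pattern)
```

A real miner would additionally count how many patients exhibit the same (event, trend before, trend after) triple, and a more robust trend estimate (for example a fitted slope) would likely replace the first-to-last difference used here.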

    Analyzing Patient Trajectories With Artificial Intelligence

    In digital medicine, patient data typically record health events over time (e.g., through electronic health records, wearables, or other sensing technologies) and thus form unique patient trajectories. Patient trajectories are highly predictive of the future course of diseases and therefore facilitate effective care. However, digital medicine often uses only limited patient data, consisting of health events from a single time point or a small number of time points, while ignoring additional information encoded in patient trajectories. To analyze such rich longitudinal data, new artificial intelligence (AI) solutions are needed. In this paper, we provide an overview of recent efforts to develop trajectory-aware AI solutions and offer suggestions for future directions. Specifically, we examine the implications for developing disease models from patient trajectories along the typical workflow in AI: problem definition, data processing, modeling, evaluation, and interpretation. We conclude with a discussion of how such AI solutions will allow the field to build robust models for personalized risk scoring, subtyping, and disease pathway discovery.

    Discovering Evolving Temporal Information: Theory and Application to Clinical Databases

    Functional dependencies (FDs) allow us to represent database constraints, corresponding to requirements such as “patients having the same symptoms undergo the same medical tests.” Some research efforts have focused on extending such dependencies to also consider temporal constraints such as “patients having the same symptoms undergo in the next period the same medical tests.” Temporal functional dependencies are able to represent this kind of temporal constraint in relational databases. Another extension of FDs allows one to represent approximate functional dependencies (AFDs), such as “patients with the same symptoms generally undergo the same medical tests.” It enables data to deviate from the defined constraints according to a user-defined percentage. Approximate temporal functional dependencies (ATFDs) merge the concepts of temporal functional dependency and approximate functional dependency. Among the different kinds of ATFD, the Approximate Pure Temporally Evolving Functional Dependencies (APE-FDs for short) allow one to detect patterns in the evolution of data in the database and to discover dependencies such as “For most patients with the same initial diagnosis, the same medical test is prescribed after the occurrence of the same symptom.” Mining ATFDs from large databases may be computationally expensive. In this paper, we focus on APE-FDs and prove that, unfortunately, verifying a single APE-FD over a given database instance is in general NP-complete. In order to cope with this problem, we propose a framework for mining complex APE-FDs in real-world data collections. In the framework, we designed and applied sound and advanced model-checking techniques. To prove the feasibility of our proposal, we used real-world databases from two medical domains (namely, psychiatry and pharmacovigilance) and tested the running prototype we developed on such databases.
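
To make the notion of an approximate functional dependency concrete, here is a minimal sketch of one common way to measure deviation: the smallest fraction of tuples that would have to be removed for X → Y to hold exactly (often called the g3 error in the AFD literature). The abstract above does not say which measure the paper uses, so this choice, the function names, and the sample rows are assumptions.

```python
from collections import Counter, defaultdict


def afd_error(rows, lhs, rhs):
    """Fraction of rows that must be dropped so that lhs -> rhs holds exactly."""
    if not rows:
        return 0.0
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        groups[key][value] += 1
    # In each lhs group, keep only the rows carrying the most frequent rhs value.
    kept = sum(max(counter.values()) for counter in groups.values())
    return 1 - kept / len(rows)


def holds_approximately(rows, lhs, rhs, epsilon):
    """True if the dependency holds once at most an `epsilon` fraction of rows is ignored."""
    return afd_error(rows, lhs, rhs) <= epsilon


# Hypothetical visits: most patients with fever get a blood test, one gets an X-ray.
visits = [
    {"symptom": "fever", "test": "blood"},
    {"symptom": "fever", "test": "blood"},
    {"symptom": "fever", "test": "blood"},
    {"symptom": "fever", "test": "xray"},
    {"symptom": "cough", "test": "xray"},
]
print(afd_error(visits, ["symptom"], ["test"]))
print(holds_approximately(visits, ["symptom"], ["test"], epsilon=0.25))
```

This plain, non-temporal check runs in linear time. The NP-completeness result stated in the abstract concerns APE-FDs, whose verification over evolving data is a harder problem that this sketch does not address.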

    Discovering Quantitative Temporal Functional Dependencies on Clinical Data

    Approximate functional dependencies, even with suitable temporal extensions, have recently been proposed as a methodological tool for mining clinical data. They allow healthcare stakeholders to derive new knowledge from the overwhelming amount of healthcare and clinical data. Some examples of the kind of knowledge derivable from data through dependencies are 'month by month, patients with the same symptoms get the same type of therapy' or 'within 15 days, patients with the same diagnosis and the same therapy receive the same daily amount of drug'. The main limitation of this kind of dependency is that it cannot deal with quantitative data when some tolerance can be allowed for numerical values. In particular, this limitation arises in clinical data warehouses, where analysis and mining have to consider one or more measures (related to quantitative data such as lab test results and vital signs such as blood pressure, temperature, and so on), with respect to many dimensional (alphanumeric) attributes (such as patient, hospital, physician, diagnosis) and to some time dimensions (such as the day since hospitalization, the calendar date, and so on). For this scenario, we introduce here a new kind of approximate temporal functional dependency, named multi approximate temporal functional dependency (MATFD), which considers dependencies between dimensions and quantitative measures from temporal clinical data. Such new dependencies may provide new knowledge such as 'within 15 days, patients with the same diagnosis and the same therapy receive a daily amount of drug within a fixed range'. Moreover, we provide an original algorithm to mine this kind of dependency and to derive some core dependencies, both for the discovered temporal window and for the involved dimensional attributes. Finally, we discuss some first results we obtained by pre-processing and mining ICU data from the MIMIC III database.
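
The example 'within 15 days, patients with the same diagnosis and the same therapy receive a daily amount of drug within a fixed range' can be sketched as a sliding-window check on the spread of a measure. This is only an illustration of the tolerance idea, not the paper's mining algorithm or its formal MATFD semantics; the function name, the field names, and the sample records are hypothetical.

```python
from collections import defaultdict


def window_violations(records, dims, measure, time_attr, window_days, tolerance):
    """List (window_start, group_key) pairs whose measure spread exceeds `tolerance`.

    A window starts at each distinct time value and covers `window_days` days.
    Within a window, records are grouped by their values on `dims`.
    """
    violations = []
    for start in sorted({r[time_attr] for r in records}):
        groups = defaultdict(list)
        for r in records:
            if start <= r[time_attr] < start + window_days:
                groups[tuple(r[d] for d in dims)].append(r[measure])
        for key, values in groups.items():
            if max(values) - min(values) > tolerance:
                violations.append((start, key))
    return violations


# Hypothetical prescriptions: day since hospitalization, diagnosis, therapy, daily dose.
prescriptions = [
    {"day": 1, "diagnosis": "sepsis", "therapy": "A", "dose": 10},
    {"day": 3, "diagnosis": "sepsis", "therapy": "A", "dose": 12},
    {"day": 10, "diagnosis": "sepsis", "therapy": "A", "dose": 11},
    {"day": 20, "diagnosis": "sepsis", "therapy": "A", "dose": 30},
]
bad = window_violations(
    prescriptions, ["diagnosis", "therapy"], "dose", "day",
    window_days=15, tolerance=5,
)
print(bad)
```

An approximate variant would accept the dependency when only a small share of windows or tuples violates the range, in the same spirit as the user-defined deviation allowed by approximate functional dependencies.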