
    FIBS: A Generic Framework for Classifying Interval-based Temporal Sequences

    We study the problem of classifying interval-based temporal sequences (IBTSs). Since common classification algorithms cannot be applied directly to IBTSs, the main challenge is to define a set of features that represents the data effectively so that classifiers can be applied. Most prior work uses frequent pattern mining to define a feature set based on discovered patterns. However, frequent pattern mining is computationally expensive and often discovers many irrelevant patterns. To address this shortcoming, we propose the FIBS framework for classifying IBTSs. FIBS extracts features relevant to classification from IBTSs based on relative frequency and temporal relations. To avoid selecting irrelevant features, a filter-based selection strategy is incorporated into FIBS. Our empirical evaluation on eight real-world datasets demonstrates the effectiveness of our methods in practice. The results provide evidence that FIBS effectively represents IBTSs for classification algorithms, yielding accuracy similar to or significantly better than that of state-of-the-art competitors. They also suggest that the feature selection strategy is beneficial to FIBS's performance.
    Comment: In: Big Data Analytics and Knowledge Discovery. DaWaK 2020. Springer, Cha
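    As a rough illustration of the relative-frequency idea mentioned in the abstract, the sketch below turns one interval sequence into a fixed-length feature vector. The abstract does not define the features precisely, so this assumes that "relative frequency" means the fraction of intervals in a sequence that carry a given event label. The function name, the tuple layout `(label, start, end)`, and the sample data are hypothetical; features based on temporal relations and the filter-based selection step are not shown.

```python
from collections import Counter


def relative_frequency_features(sequence, vocabulary):
    """Map an interval sequence to one feature per label in `vocabulary`.

    `sequence` is a list of (label, start, end) tuples (assumed layout).
    Each feature is the share of intervals in the sequence with that label.
    """
    counts = Counter(label for label, _start, _end in sequence)
    total = len(sequence)
    # An empty sequence gets an all-zero vector rather than a division error.
    return [counts[label] / total if total else 0.0 for label in vocabulary]


# Hypothetical sequence with four intervals over three event labels.
seq = [("A", 0, 5), ("B", 3, 8), ("A", 9, 12), ("C", 10, 11)]
features = relative_frequency_features(seq, ["A", "B", "C", "D"])
```

    Vectors of this kind can be passed to any standard classifier, which is the representation problem the abstract describes.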

    Analysis and Visualization Methods for Data-Driven Longitudinal Patient Summary

    Digitization of health records has opened avenues for intensive research in the field of health informatics. The power of machine learning, statistical analysis and visual analytics could be used to make optimal use of this information. The proposed project is to develop an interactive visualization tool that summarizes a patient's medical history, highlighting all of the patient's important events based on knowledge of similar patients. Given a set of patients with common conditions, statistical analysis can be used to develop models that prioritize features based on associations between features and condition-specific outcome measures. This manuscript in particular describes the model developed to prioritize a patient's events from their medical history. The model is trained on a population of patients and their events. The correlations of these events with the outcome variable are calculated to identify the important events in a specific cohort. This correlation score can then be used to prioritize the events associated with an individual patient. This model is one of the models that will be used to summarize an individual patient's medical data via interactive visualization methods.
    Master of Science in Information Scienc
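    The correlation-based prioritization described above can be sketched as follows. This is only an assumption about how such a score might be computed: each event is encoded as a 0/1 indicator per patient, its Pearson correlation with a numeric outcome is taken across the cohort, and the absolute value is used as the priority score. The function names, the event labels, and the sample cohort are hypothetical and do not come from the manuscript.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # A constant column carries no correlation signal.
    return cov / sqrt(var_x * var_y)


def event_scores(cohort_events, outcomes):
    """Score each event by |correlation| between its presence and the outcome."""
    all_events = sorted(set().union(*cohort_events))
    scores = {}
    for event in all_events:
        indicator = [1 if event in events else 0 for events in cohort_events]
        scores[event] = abs(pearson(indicator, outcomes))
    return scores


def prioritize(patient_events, scores):
    """Order one patient's events from highest to lowest cohort score."""
    return sorted(patient_events, key=lambda e: scores.get(e, 0.0), reverse=True)


# Hypothetical cohort: one set of events per patient, plus a binary outcome.
cohort = [{"mi", "flu"}, {"mi"}, {"flu"}, {"checkup"}]
outcomes = [1, 1, 0, 0]
scores = event_scores(cohort, outcomes)
ranked = prioritize(["flu", "checkup", "mi"], scores)
```

    Taking the absolute value treats strongly negative associations as equally noteworthy; a real system might instead keep the sign or use a different association measure.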

    Mining Predictive Patterns and Extension to Multivariate Temporal Data

    An important goal of knowledge discovery is the search for patterns in the data that can help explain its underlying structure. To be practically useful, the discovered patterns should be novel (unexpected) and easy for humans to understand. In this thesis, we study the problem of mining patterns (defining subpopulations of data instances) that are important for predicting and explaining a specific outcome variable. An example is the task of identifying groups of patients who respond better to a certain treatment than the rest of the patients. We propose and present efficient methods for mining predictive patterns for both atemporal and temporal (time series) data. Our first method relies on frequent pattern mining to explore the search space. It applies a novel evaluation technique for extracting a small set of frequent patterns that are highly predictive and have low redundancy. We show the benefits of this method on several synthetic and public datasets. Our temporal pattern mining method works on complex multivariate temporal data, such as electronic health records, for the event detection task. It first converts time series into time-interval sequences of temporal abstractions and then mines temporal patterns backwards in time, starting from patterns related to the most recent observations. We show the benefits of our temporal pattern mining method on two real-world clinical tasks.
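    The conversion of a time series into time-interval sequences of temporal abstractions can be sketched as below. This is a minimal example under stated assumptions, not the thesis's method: it uses a simple value abstraction with three states (low, normal, high) defined by two thresholds, and merges consecutive observations that share a state into one interval. The function name, thresholds, and sample readings are hypothetical, and the backward pattern mining step is not shown.

```python
def value_abstraction(series, low, high):
    """Convert (time, value) observations into (state, start, end) intervals.

    `series` must be sorted by time. States are "L" (below `low`),
    "H" (above `high`) and "N" (in between); thresholds are assumptions.
    """
    def state_of(value):
        if value < low:
            return "L"
        if value > high:
            return "H"
        return "N"

    intervals = []
    for time, value in series:
        state = state_of(value)
        if intervals and intervals[-1][0] == state:
            # Same state as the previous observation: extend the open interval.
            prev_state, start, _end = intervals[-1]
            intervals[-1] = (prev_state, start, time)
        else:
            # State changed: start a new interval at this time point.
            intervals.append((state, time, time))
    return intervals


# Hypothetical readings of a single clinical variable at hourly time points.
series = [(0, 95), (1, 110), (2, 180), (3, 190), (4, 100)]
intervals = value_abstraction(series, low=70, high=140)
```

    Applying this to each variable of a multivariate record yields one interval sequence per variable, which is the kind of input a temporal pattern miner would then consume.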