174,012 research outputs found

    Temporal Pattern Discovery for Accurate Sepsis Diagnosis in ICU Patients

    Full text link
    Sepsis is a condition caused by the body's overwhelming and life-threatening response to infection, which can lead to tissue damage, organ failure, and finally death. Common signs and symptoms include fever, increased heart rate, increased breathing rate, and confusion. Sepsis is difficult to predict, diagnose, and treat. Patients who develop sepsis have an increased risk of complications and death and face higher health care costs and longer hospitalization. Today, sepsis is one of the leading causes of mortality among populations in intensive care units (ICUs). In this paper, we look at the problem of early detection of sepsis by using temporal data mining. We focus on the use of knowledge-based temporal abstraction to create meaningful interval-based abstractions, and on time-interval mining to discover frequent interval-based patterns. We used 2,560 cases derived from the MIMIC-III database. We found that the distribution of the temporal patterns whose frequency is above 10% discovered in the records of septic patients during the last 6 and 12 hours before onset of sepsis is significantly different from that distribution within a similar period, during an equivalent time window during hospitalization, in the records of non-septic patients. This discovery is encouraging for the purpose of performing an early diagnosis of sepsis using the discovered patterns as constructed features

    Significant Interval and Frequent Pattern Discovery in Web Log Data

    Full text link
    There is a considerable body of work on sequence mining of Web Log Data. We are using One Pass frequent Episode discovery (or FED) algorithm, takes a different approach than the traditional apriori class of pattern detection algorithms. In this approach significant intervals for each Website are computed first (independently) and these interval used for detecting frequent patterns/Episode and then the Analysis is performed on Significant Intervals and frequent patterns That can be used to forecast the user's behavior using previous trends and this can be also used for advertising purpose. This type of applications predicts the Website interest. In this approach, time-series data are folded over a periodicity (day, week, etc.) Which are used to form the Interval? Significant intervals are discovered from these time points that satisfy the criteria of minimum confidence and maximum interval length specified by the user.Comment: International Journal of Computer Science Issues, IJCSI, Vol. 7, Issue 1, No. 3, January 2010, http://ijcsi.org/articles/Significant-Interval-and-Frequent-Pattern-Discovery-in-Web-Log-Data.ph

    Burstiness Scale: a highly parsimonious model for characterizing random series of events

    Full text link
    The problem to accurately and parsimoniously characterize random series of events (RSEs) present in the Web, such as e-mail conversations or Twitter hashtags, is not trivial. Reports found in the literature reveal two apparent conflicting visions of how RSEs should be modeled. From one side, the Poissonian processes, of which consecutive events follow each other at a relatively regular time and should not be correlated. On the other side, the self-exciting processes, which are able to generate bursts of correlated events and periods of inactivities. The existence of many and sometimes conflicting approaches to model RSEs is a consequence of the unpredictability of the aggregated dynamics of our individual and routine activities, which sometimes show simple patterns, but sometimes results in irregular rising and falling trends. In this paper we propose a highly parsimonious way to characterize general RSEs, namely the Burstiness Scale (BuSca) model. BuSca views each RSE as a mix of two independent process: a Poissonian and a self-exciting one. Here we describe a fast method to extract the two parameters of BuSca that, together, gives the burstyness scale, which represents how much of the RSE is due to bursty and viral effects. We validated our method in eight diverse and large datasets containing real random series of events seen in Twitter, Yelp, e-mail conversations, Digg, and online forums. Results showed that, even using only two parameters, BuSca is able to accurately describe RSEs seen in these diverse systems, what can leverage many applications

    Cats & Co: Categorical Time Series Coclustering

    Full text link
    We suggest a novel method of clustering and exploratory analysis of temporal event sequences data (also known as categorical time series) based on three-dimensional data grid models. A data set of temporal event sequences can be represented as a data set of three-dimensional points, each point is defined by three variables: a sequence identifier, a time value and an event value. Instantiating data grid models to the 3D-points turns the problem into 3D-coclustering. The sequences are partitioned into clusters, the time variable is discretized into intervals and the events are partitioned into clusters. The cross-product of the univariate partitions forms a multivariate partition of the representation space, i.e., a grid of cells and it also represents a nonparametric estimator of the joint distribution of the sequences, time and events dimensions. Thus, the sequences are grouped together because they have similar joint distribution of time and events, i.e., similar distribution of events along the time dimension. The best data grid is computed using a parameter-free Bayesian model selection approach. We also suggest several criteria for exploiting the resulting grid through agglomerative hierarchies, for interpreting the clusters of sequences and characterizing their components through insightful visualizations. Extensive experiments on both synthetic and real-world data sets demonstrate that data grid models are efficient, effective and discover meaningful underlying patterns of categorical time series data

    Discovering Compressing Serial Episodes from Event Sequences

    Full text link
    Most pattern mining methods output a very large number of frequent patterns and isolating a small but relevant subset is a challenging problem of current interest in frequent pattern mining. In this paper we consider discovery of a small set of relevant frequent episodes from data sequences. We make use of the Minimum Description Length principle to formulate the problem of selecting a subset of episodes. Using an interesting class of serial episodes with inter-event constraints and a novel encoding scheme for data using such episodes, we present algorithms for discovering small set of episodes that achieve good data compression. Using an example of the data streams obtained from distributed sensors in a composable coupled conveyor system, we show that our method is very effective in unearthing highly relevant episodes and that our scheme also achieves good data compression.Comment: 27 pages 3 figur

    Learning Temporal Causal Sequence Relationships from Real-Time Time-Series

    Full text link
    We aim to mine temporal causal sequences that explain observed events (consequents) in time-series traces. Causal explanations of key events in a time-series has applications in design debugging, anomaly detection, planning, root-cause analysis and many more. We make use of decision trees and interval arithmetic to mine sequences that explain defining events in the time-series. We propose modified decision tree construction metrics to handle the non-determinism introduced by the temporal dimension. The mined sequences are expressed in a readable temporal logic language that is easy to interpret. The application of the proposed methodology is illustrated through various examples.Comment: This article appears in the Journal of Artificial Intelligenc

    Water Disaggregation via Shape Features based Bayesian Discriminative Sparse Coding

    Full text link
    As the issue of freshwater shortage is increasing daily, it is critical to take effective measures for water conservation. According to previous studies, device level consumption could lead to significant freshwater conservation. Existing water disaggregation methods focus on learning the signatures for appliances; however, they are lack of the mechanism to accurately discriminate parallel appliances' consumption. In this paper, we propose a Bayesian Discriminative Sparse Coding model using Laplace Prior (BDSC-LP) to extensively enhance the disaggregation performance. To derive discriminative basis functions, shape features are presented to describe the low-sampling-rate water consumption patterns. A Gibbs sampling based inference method is designed to extend the discriminative capability of the disaggregation dictionaries. Extensive experiments were performed to validate the effectiveness of the proposed model using both real-world and synthetic datasets.Comment: 20 page

    Temporal data mining for root-cause analysis of machine faults in automotive assembly lines

    Full text link
    Engine assembly is a complex and heavily automated distributed-control process, with large amounts of faults data logged everyday. We describe an application of temporal data mining for analyzing fault logs in an engine assembly plant. Frequent episode discovery framework is a model-free method that can be used to deduce (temporal) correlations among events from the logs in an efficient manner. In addition to being theoretically elegant and computationally efficient, frequent episodes are also easy to interpret in the form actionable recommendations. Incorporation of domain-specific information is critical to successful application of the method for analyzing fault logs in the manufacturing domain. We show how domain-specific knowledge can be incorporated using heuristic rules that act as pre-filters and post-filters to frequent episode discovery. The system described here is currently being used in one of the engine assembly plants of General Motors and is planned for adaptation in other plants. To the best of our knowledge, this paper presents the first real, large-scale application of temporal data mining in the manufacturing domain. We believe that the ideas presented in this paper can help practitioners engineer tools for analysis in other similar or related application domains as well

    A framework for event co-occurrence detection in event streams

    Full text link
    This paper shows that characterizing co-occurrence between events is an important but non-trivial and neglected aspect of discovering potential causal relationships in multimedia event streams. First an introduction to the notion of event co-occurrence and its relation to co-occurrence pattern detection is given. Then a finite state automaton extended with a time model and event parameterization is introduced to convert high level co-occurrence pattern definition to its corresponding pattern matching automaton. Finally a processing algorithm is applied to count the occurrence frequency of a collection of patterns with only one pass through input event streams. The method proposed in this paper can be used for detecting co-occurrences between both events of one event stream (Auto co-occurrence), and events from multiple event streams (Cross co-occurrence). Some fundamental results concerning the characterization of event co-occurrence are presented in form of a visual co- occurrence matrix. Reusable causality rules can be extracted easily from co-occurrence matrix and fed into various analysis tools, such as recommendation systems and complex event processing systems for further analysis

    Event Identification in Social Networks

    Full text link
    Social networks enable users to freely communicate with each other and share their recent news, ongoing activities or views about different topics. As a result, they can be seen as a potentially viable source of information to understand the current emerging topics/events. The ability to model emerging topics is a substantial step to monitor and summarize the information originating from social sources. Applying traditional methods for event detection which are often proposed for processing large, formal and structured documents, are less effective, due to the short length, noisiness and informality of the social posts. Recent event detection techniques address these challenges by exploiting the opportunities behind abundant information available in social networks. This article provides an overview of the state of the art in event detection from social networks.Comment: It will appear in Encyclopedia with Semantic Computing to be published by World Scientifi
    • …
    corecore