44,183 research outputs found
Mining High Utility Time Interval Sequences Using MapReduce Approach: Multiple Utility Framework
Mining high utility sequential patterns is observed to be a significant research in data mining. Several methods mine the sequential patterns while taking utility values into consideration. The patterns of this type can determine the order in which items were purchased, but not the time interval between them. The time interval among items is important for predicting the most useful real-world circumstances, including retail market basket data analysis, stock market fluctuations, DNA sequence analysis, and so on. There are a very few algorithms for mining sequential patterns those consider both the utility and time interval. However, they assume the same threshold for each item, maintaining the same unit profit. Moreover, with the rapid growth in data, the traditional algorithms cannot handle the big data and are not scalable. To handle this problem, we propose a distributed three phase MapReduce framework that considers multiple utilities and suitable for handling big data. The time constraints are pushed into the algorithm instead of pre-defined intervals. Also, the proposed upper bound minimizes the number of candidate patterns during the mining process. The approach has been tested and the experimental results show its efficiency in terms of run time, memory utilization, and scalability
Can we Take Advantage of Time-Interval Pattern Mining to Model Students Activity?
International audienceAnalyzing students' activities in their learning process is an issue that has received significant attention in the educational data mining research field. Many approaches have been proposed, including the popular sequential pattern mining. However, the vast majority of the works do not focus on the time of occurrence of the events within the activities. This paper relies on the hypothesis that we can get a better understanding of students' activities, as well as design more accurate models, if time is considered. With this in mind, we propose to study time-interval patterns. To highlight the benefits of managing time, we analyze the data collected about 113 first-year university students interacting with their LMS. Experiments reveal that frequent time-interval patterns are actually identified, which means that some students' activities are regulated not only by the order of learning resources but also by time. In addition, the experiments emphasize that the sets of intervals highly influence the patterns mined and that the set of intervals that represents the human natural time (minute, hour, day, etc.) seems to be the most appropriate one to represent time gap between resources. Finally, we show that time-interval pattern mining brings additional information compared to sequential pattern mining. Indeed, not only the view of students' possible future activities is less uncertain (in terms of learning resources and their temporal gap) but also, as soon as two students differ in their time-intervals, this di↵erence indicates that their following activities are likely to diverge
Feature-based time-series analysis
This work presents an introduction to feature-based time-series analysis. The
time series as a data type is first described, along with an overview of the
interdisciplinary time-series analysis literature. I then summarize the range
of feature-based representations for time series that have been developed to
aid interpretable insights into time-series structure. Particular emphasis is
given to emerging research that facilitates wide comparison of feature-based
representations that allow us to understand the properties of a time-series
dataset that make it suited to a particular feature-based representation or
analysis algorithm. The future of time-series analysis is likely to embrace
approaches that exploit machine learning methods to partially automate human
learning to aid understanding of the complex dynamical patterns in the time
series we measure from the world.Comment: 28 pages, 9 figure
Constraining the Search Space in Temporal Pattern Mining
Agents in dynamic environments have to deal with complex situations including various temporal interrelations of actions and events. Discovering frequent patterns in such scenes can be useful in order to create prediction rules which can be used to predict future activities or situations. We present the algorithm MiTemP which learns frequent patterns based on a time intervalbased relational representation. Additionally the problem has also been transfered to a pure relational association rule mining task which can be handled by WARMR. The two approaches are compared in a number of experiments. The experiments show the advantage of avoiding the creation of impossible or redundant patterns with MiTemP. While less patterns have to be explored on average with MiTemP more frequent patterns are found at an earlier refinement level
Exploring the Evolution of Node Neighborhoods in Dynamic Networks
Dynamic Networks are a popular way of modeling and studying the behavior of
evolving systems. However, their analysis constitutes a relatively recent
subfield of Network Science, and the number of available tools is consequently
much smaller than for static networks. In this work, we propose a method
specifically designed to take advantage of the longitudinal nature of dynamic
networks. It characterizes each individual node by studying the evolution of
its direct neighborhood, based on the assumption that the way this neighborhood
changes reflects the role and position of the node in the whole network. For
this purpose, we define the concept of \textit{neighborhood event}, which
corresponds to the various transformations such groups of nodes can undergo,
and describe an algorithm for detecting such events. We demonstrate the
interest of our method on three real-world networks: DBLP, LastFM and Enron. We
apply frequent pattern mining to extract meaningful information from temporal
sequences of neighborhood events. This results in the identification of
behavioral trends emerging in the whole network, as well as the individual
characterization of specific nodes. We also perform a cluster analysis, which
reveals that, in all three networks, one can distinguish two types of nodes
exhibiting different behaviors: a very small group of active nodes, whose
neighborhood undergo diverse and frequent events, and a very large group of
stable nodes
Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare
For the last years, time-series mining has become a challenging issue for
researchers. An important application lies in most monitoring purposes, which
require analyzing large sets of time-series for learning usual patterns. Any
deviation from this learned profile is then considered as an unexpected
situation. Moreover, complex applications may involve the temporal study of
several heterogeneous parameters. In that paper, we propose a method for mining
heterogeneous multivariate time-series for learning meaningful patterns. The
proposed approach allows for mixed time-series -- containing both pattern and
non-pattern data -- such as for imprecise matches, outliers, stretching and
global translating of patterns instances in time. We present the early results
of our approach in the context of monitoring the health status of a person at
home. The purpose is to build a behavioral profile of a person by analyzing the
time variations of several quantitative or qualitative parameters recorded
through a provision of sensors installed in the home
- …