Search CORE

30 research outputs found

Mining Target-Oriented Sequential Patterns with Time-Intervals

Author: Chueh Hao-En
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 05/09/2010
Field of study

A target-oriented sequential pattern is a sequential pattern with a concerned itemset in the end of pattern. A time-interval sequential pattern is a sequential pattern with time-intervals between every pair of successive itemsets. In this paper we present an algorithm to discover target-oriented sequential pattern with time-intervals. To this end, the original sequences are reversed so that the last itemsets can be arranged in front of the sequences. The contrasts between reversed sequences and the concerned itemset are then used to exclude the irrelevant sequences. Clustering analysis is used with typical sequential pattern mining algorithm to extract the sequential patterns with time-intervals between successive itemsets. Finally, the discovered time-interval sequential patterns are reversed again to the original order for searching the target patterns.Comment: 11 pages, 9 table

arXiv.org e-Print Archive

Crossref

A model for quality guaranteed resource-aware stream mining

Author: Franke C.
Gaber M.
Karnstedt M.
Publication venue
Publication date: 17/09/2007
Field of study

Portsmouth University Research Portal (Pure)

Efficient Incremental Breadth-Depth XML Event Mining

Author: Boussaïd Omar
Darmont Jérôme
Salem Rashed
Publication venue
Publication date: 01/01/2011
Field of study

Many applications log a large amount of events continuously. Extracting interesting knowledge from logged events is an emerging active research area in data mining. In this context, we propose an approach for mining frequent events and association rules from logged events in XML format. This approach is composed of two-main phases: I) constructing a novel tree structure called Frequency XML-based Tree (FXT), which contains the frequency of events to be mined; II) querying the constructed FXT using XQuery to discover frequent itemsets and association rules. The FXT is constructed with a single-pass over logged data. We implement the proposed algorithm and study various performance issues. The performance study shows that the algorithm is efficient, for both constructing the FXT and discovering association rules

arXiv.org e-Print Archive

Crossref

HAL Descartes

HAL

Frequent Item Set Mining Using INC_MINE in Massive Online Analysis Frame Work

Author: Malini M.Patil.
Srimani P.K.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015
Field of study

Frequent Pattern Mining is one of the major data mining techniques, which is exhaustively studied in the past decade. The technological advancements have resulted in huge data generation, having increased rate of data distribution. The generated data is called as a ‘data stream’. Data streams can be mined only by using sophisticated techniques. The paper aims at carrying out frequent pattern mining on data streams. Stream mining has great challenges due to high memory usage and computational costs. Massive online analysis frame work is a software environment used to perform frequent pattern mining using INC_MINE algorithm. The algorithm uses the method of closed frequent mining. The data sets used in the analysis are Electricity data set and Airline data set. The authors also generated their own data set, OUR-GENERATOR for the purpose of analysis and the results are found interesting. In the experiments five samples of instance sizes (10000, 15000, 25000, 35000, 50000) are used with varying minimum support and window sizes for determining frequent closed itemsets and semi frequent closed itemsets respectively. The present work establishes that association rule mining could be performed even in the case of data stream mining by INC_MINE algorithm by generating closed frequent itemsets which is first of its kind in the literature

Elsevier - Publisher Connector

ePrints@Bangalore University

Integrated Clustering and Anomaly Detection (INCAD) for Streaming Data (Revised)

Author: Charles E. Antoniak
David A. Clifton
E Eskin
J French
J Gama
M Goldstein
Matthew S. Shotwell
N Görnitz
N Hjort
N Jiang
P. García-Teodoro
RM Neal
V Chandola
Yee Whye Teh
Z He
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/10/2019
Field of study

Most current clustering based anomaly detection methods use scoring schema and thresholds to classify anomalies. These methods are often tailored to target specific data sets with "known" number of clusters. The paper provides a streaming clustering and anomaly detection algorithm that does not require strict arbitrary thresholds on the anomaly scores or knowledge of the number of clusters while performing probabilistic anomaly detection and clustering simultaneously. This ensures that the cluster formation is not impacted by the presence of anomalous data, thereby leading to more reliable definition of "normal vs abnormal" behavior. The motivations behind developing the INCAD model and the path that leads to the streaming model is discussed.Comment: 13 pages; fixes typos in equations 5,6,9,10 on inference using Gibbs samplin

arXiv.org e-Print Archive

Crossref