Search CORE

Research on Pattern Matching with Wildcards and Length Constraints: Methods and Completeness

Author: Hu Xuegang
Wang Haiping
Xiang Taining
Publication venue: IntechOpen
Publication date: 28/11/2012

DESQ: Frequent Sequence Mining with Subsequence Constraints

Author: Beedkar Kaustubh
Gemulla Rainer
Publication date: 01/01/2016

Frequent sequence mining methods often use constraints to control which subsequences should be mined. A variety of such subsequence constraints has been studied in the literature, including length, gap, span, regular-expression, and hierarchy constraints. In this paper, we show that many subsequence constraints, including and beyond those considered in the literature, can be unified in a single framework. A unified treatment allows researchers to study many types of subsequence constraints jointly (instead of each one individually) and helps to improve the usability of pattern mining systems for practitioners. In more detail, we propose a set of simple and intuitive "pattern expressions" to describe subsequence constraints and explore algorithms for efficiently mining frequent subsequences under such general constraints. Our algorithms translate pattern expressions to compressed finite state transducers, which we use as the computational model, and simulate these transducers in a way suitable for frequent sequence mining. Our experimental study on real-world datasets indicates that our algorithms, although more general, are competitive with existing state-of-the-art algorithms.

Comment: Long version of the paper accepted at the IEEE ICDM 2016 conference.

arXiv.org e-Print Archive

Crossref

MAnnheim DOCument Server
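
As a rough illustration of what a single subsequence constraint means in the DESQ abstract above, the following sketch counts the support of a pattern under a maximum-gap constraint. It is a brute-force check written for this listing, not DESQ's pattern expressions or its finite state transducers; the names `contains_with_gap`, `support`, and `max_gap` are hypothetical.

```python
def contains_with_gap(sequence, pattern, max_gap):
    """Return True if `pattern` occurs in `sequence` as a subsequence with at
    most `max_gap` skipped items between consecutive matched items."""
    if not pattern:
        return True

    def match(prev_pos, pat_pos):
        # prev_pos: index in `sequence` of the previously matched item.
        if pat_pos == len(pattern):
            return True
        # The next item may sit at most max_gap positions beyond prev_pos + 1.
        stop = min(len(sequence), prev_pos + max_gap + 2)
        for j in range(prev_pos + 1, stop):
            if sequence[j] == pattern[pat_pos] and match(j, pat_pos + 1):
                return True
        return False

    return any(
        sequence[i] == pattern[0] and match(i, 1)
        for i in range(len(sequence))
    )


def support(database, pattern, max_gap):
    """Number of input sequences that contain `pattern` under the gap constraint."""
    return sum(contains_with_gap(seq, pattern, max_gap) for seq in database)


# Toy database (illustrative data only).
database = [list("abcab"), list("axxb"), list("ba")]
print(support(database, ["a", "b"], max_gap=0))
print(support(database, ["a", "b"], max_gap=2))
```

A pattern is then called frequent if its support reaches a chosen threshold; loosening `max_gap` can only keep or increase the support.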

Mining for Frequent Events in Time Series

Author: Stoecker-Sylvia Zachary
Publication venue: Digital WPI
Publication date: 02/09/2004

While much work has been done on mining nominal sequential data, much less has been done on mining numeric time series data. This stems primarily from the difficulty of relating numeric data, which likely contain errors or other variations that make comparing values directly difficult. To handle this problem, many algorithms first convert the data into a sequence of events. In some cases these events are known a priori, but in others they are not. Our work evaluates a set of time series data instances in order to determine likely candidates for unknown underlying events. We use the concept of bounding envelopes to represent the area around a numeric time series in which the unknown noise-free points could exist. We then use an algorithm similar to Apriori to build up sets of envelope intersections. The areas created by these intersections represent common patterns found throughout the data.

DigitalCommons@WPI
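
The bounding-envelope idea in the abstract above can be sketched as follows. This is a minimal illustration under a stated assumption: each envelope is a fixed band of half-width `eps` around every point, which may differ from the definition used in the thesis. The Apriori-like step that builds larger sets of intersections is not shown, and the function names are hypothetical.

```python
def envelope(series, eps):
    """Bounding envelope: for each point, the interval in which the unknown
    noise-free value could lie (assumed here to be value +/- eps)."""
    return [(v - eps, v + eps) for v in series]


def intersect(env_a, env_b):
    """Pointwise intersection of two equal-length envelopes.

    Returns the intersected envelope, or None if the intervals fail to
    overlap at any position (the two series cannot share one noise-free shape).
    """
    if len(env_a) != len(env_b):
        raise ValueError("envelopes must have the same length")
    result = []
    for (lo_a, hi_a), (lo_b, hi_b) in zip(env_a, env_b):
        lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
        if lo > hi:
            return None
        result.append((lo, hi))
    return result


# Two noisy observations of a possibly shared underlying event (toy data).
a = envelope([1.0, 2.0], eps=0.5)
b = envelope([1.5, 2.5], eps=0.5)
print(intersect(a, b))
```

A non-empty intersection is a narrower region that is consistent with both series, which is why repeated intersection can expose a common pattern.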

Structure of Alignment-Free Algorithms for Diverse Post-Genome Data

Author: Onodera Taku
小野寺拓
Publication venue: Department of Computer Science, Graduate School of Information Science and Technology
Publication date: 11/11/2015

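Degree type: Doctorate by coursework. Examination committee: (Chair) Professor Hiroshi Imai, University of Tokyo; Professor Naoki Kobayashi, University of Tokyo; Professor Takeo Igarashi, University of Tokyo; Professor Masashi Sugiyama, University of Tokyo; Lecturer Masahiro Kasahara, University of Tokyo. University of Tokyo (東京大学)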

OPR-Miner: Order-preserving rule mining for time series

Author: Fournier-Viger Philippe
Guo Lei
Li Yan
Wu Xindong
Wu Youxi
Zhao Xiaoqian
Zhu Xingquan
Publication date: 09/10/2022

Discovering frequent trends in time series is a critical task in data mining. Recently, order-preserving matching was proposed to find all occurrences of a pattern in a time series, where the pattern is a relative order (regarded as a trend) and an occurrence is a sub-time series whose relative order coincides with the pattern. Inspired by order-preserving matching, the existing order-preserving pattern (OPP) mining algorithm employs order-preserving matching to calculate the support, which leads to low efficiency. To address this deficiency, this paper proposes an algorithm called efficient frequent OPP miner (EFO-Miner) to find all frequent OPPs. EFO-Miner is composed of four parts: a pattern fusion strategy to generate candidate patterns, a matching process that uses the results of sub-patterns to calculate the support of super-patterns, a screening strategy to dynamically reduce the size of prefix and suffix arrays, and a pruning strategy to further dynamically prune candidate patterns. Moreover, this paper explores order-preserving rule (OPR) mining and proposes an algorithm called OPR-Miner to discover strong rules from all frequent OPPs using EFO-Miner. Experimental results verify that OPR-Miner gives better performance than other competitive algorithms. More importantly, clustering and classification experiments further validate that OPR-Miner achieves good performance.

arXiv.org e-Print Archive
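
The notion of order-preserving matching used in the OPR-Miner abstract above can be made concrete with a small sketch. This is the naive approach of checking every window, which the abstract describes as inefficient; it is not EFO-Miner or OPR-Miner. The function names are hypothetical, and the sketch assumes the values within each window are distinct.

```python
def relative_order(values):
    """Rank of each value within the window (1 = smallest).
    Assumes all values in the window are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return tuple(ranks)


def op_occurrences(series, pattern):
    """Start indices of all windows whose relative order equals `pattern`."""
    m = len(pattern)
    target = tuple(pattern)
    return [
        i
        for i in range(len(series) - m + 1)
        if relative_order(series[i:i + m]) == target
    ]


def op_support(series, pattern):
    """Support of an order-preserving pattern: its number of occurrences."""
    return len(op_occurrences(series, pattern))


# Toy time series; the pattern (1, 3, 2) means "low, high, middle".
series = [10, 12, 11, 15, 13, 18, 17]
print(op_occurrences(series, (1, 3, 2)))  # [0, 2, 4]
print(op_support(series, (1, 3, 2)))      # 3
```

Each window costs a sort, so counting support this way for every candidate pattern is expensive, which is the cost the abstract says EFO-Miner is designed to avoid.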

Scalable frequent sequence mining with flexible subsequence constraints

Author: Bertsch Mattias
Gemulla Rainer
Renz-Wieland Alexander
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

We study scalable algorithms for frequent sequence mining under flexible subsequence constraints. Such constraints enable applications to specify concisely which patterns are of interest and which are not. We focus on the bulk synchronous parallel model with one round of communication; this model is suitable for platforms such as MapReduce or Spark. We derive a general framework for frequent sequence mining under this model and propose the D-SEQ and D-CAND algorithms within this framework. The algorithms differ in what data are communicated and how computation is split up among workers. To the best of our knowledge, D-SEQ and D-CAND are the first scalable algorithms for frequent sequence mining with flexible constraints. We conducted an experimental study on multiple real-world datasets, which suggests that our algorithms scale nearly linearly, outperform common baselines, and offer acceptable generalization overhead over existing, less general mining algorithms.

Crossref

MAnnheim DOCument Server

Analyzing very large time series using suffix arrays

Publication venue: Springer
Publication date: 01/10/2014

Springer - Publisher Connector

A schema framework for graph event data

Author: Esser S.
Publication date: 19/02/2020

Pure OAI Repository

IoT-MQTT based denial of service attack modelling and detection

Author: Syed Naeem Firdous
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2020

The Internet of Things (IoT) is poised to transform the quality of life and provide new business opportunities with its wide range of applications. However, the benefits of this emerging paradigm are coupled with serious cyber security issues. The lack of strong cyber security measures in protecting IoT systems can result in cyber attacks targeting all the layers of the IoT architecture, which include the IoT devices, the IoT communication protocols and the services accessing the IoT data. Various IoT malware such as Mirai, BASHLITE and BrickBot show an already rising number of IoT device-based attacks as well as the usage of infected IoT devices to launch other cyber attacks. However, as sustained IoT deployment and functionality are heavily reliant on the use of effective data communication protocols, the attacks on other layers of the IoT architecture are anticipated to increase. In the IoT landscape, the publish/subscribe-based Message Queuing Telemetry Transport (MQTT) protocol is widely popular. Hence, cyber security threats against the MQTT protocol are projected to rise on par with its increasing use by IoT manufacturers. In particular, Internet-exposed MQTT brokers are vulnerable to protocol-based Application Layer Denial of Service (DoS) attacks, which have been known to cause widespread service disruptions in legacy systems. In this thesis, we propose Application Layer-based DoS attacks that target the authentication and authorisation mechanism of the MQTT protocol. In addition, we also propose an MQTT protocol attack detection framework based on machine learning. Through extensive experiments, we demonstrate the impact of authentication and authorisation DoS attacks on three open-source MQTT brokers. Based on the proposed DoS attack scenarios, an IoT-MQTT attack dataset was generated to evaluate the effectiveness of the proposed framework in detecting these malicious attacks.
The DoS attack evaluation results indicate that such attacks can overwhelm the MQTT brokers' resources even when legitimate access to them was denied and resources were restricted. The evaluations also indicate that the proposed DoS attack scenarios can significantly increase the MQTT message delay, especially for QoS 2 messages, causing heavy tail latencies. In addition, the proposed MQTT features showed high attack detection accuracy compared to simply using TCP-based features to detect MQTT-based attacks. It was also observed that the protocol field size and length-based features drastically reduced the false positive rates and hence are suitable for detecting IoT-based attacks.

Research Online @ ECU