A Review of Subsequence Time Series Clustering
Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in fields such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One useful application in this domain is pattern recognition, which operates on sequences of time series data. This paper reviews definitions and background related to subsequence time series clustering. The reviewed literature is divided into three periods: preproof, interproof, and postproof. Various state-of-the-art approaches to subsequence time series clustering are then discussed under each of these categories. The strengths and weaknesses of the employed methods are evaluated as potential issues for future studies.
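The sliding-window formulation behind subsequence time series clustering can be sketched in a few lines: extract overlapping subsequences from one long series and group them with a plain k-means. The window width, step, and number of clusters below are illustrative choices, not values from the review.

```python
import math
import random

def sliding_subsequences(series, width, step=1):
    """Extract overlapping subsequences of length `width` from a 1-D series."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]

def dist(a, b):
    """Euclidean distance between two equal-length subsequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means: each subsequence is a point in R^width."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # assign each subsequence to its nearest center
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # recompute each center as the mean of its members
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

series = [math.sin(2 * math.pi * i / 50) for i in range(400)]  # toy periodic series
subs = sliding_subsequences(series, width=25, step=5)
labels, centers = kmeans(subs, k=3)
```

Note that clustering all overlapping windows of a single series is exactly the setting whose meaningfulness the reviewed literature debates, which is why the choice of width and step matters.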
On mining complex sequential data by means of FCA and pattern structures
Nowadays data sets are available in very complex and heterogeneous ways.
Mining of such data collections is essential to support many real-world
applications ranging from healthcare to marketing. In this work, we focus on
the analysis of "complex" sequential data by means of interesting sequential
patterns. We approach the problem using the elegant mathematical framework of
Formal Concept Analysis (FCA) and its extension based on "pattern structures".
Pattern structures are used for mining complex data (such as sequences or
graphs) and are based on a subsumption operation, which in our case is defined
with respect to the partial order on sequences. We show how pattern structures
along with projections (i.e., a data reduction of sequential structures), are
able to enumerate more meaningful patterns and increase the computing
efficiency of the approach. Finally, we show the applicability of the presented
method for discovering and analyzing interesting patient patterns from a French
healthcare data set on cancer. The quantitative and qualitative results (with
annotations and analysis from a physician) are reported in this use case which
is the main motivation for this work.
Keywords: data mining; formal concept analysis; pattern structures;
projections; sequences; sequential data.
Comment: An accepted publication in the International Journal of General Systems. The paper was created in the wake of the conference on Concept Lattices and their Applications (CLA 2013). 27 pages, 9 figures, 3 tables.
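The subsumption operation on sequences described above can be illustrated with a small sketch: a pattern (a sequence of itemsets) is subsumed by a trajectory when each of its itemsets embeds, in order, into some itemset of the trajectory. The function name and the toy care-trajectory events below are illustrative assumptions, not taken from the paper's implementation.

```python
def subsumes(t, s):
    """Return True if sequence s (a list of frozensets) is a subsequence of t,
    i.e. each itemset of s is contained, in order, in some itemset of t."""
    i = 0
    for itemset in s:
        # advance through t until an itemset containing `itemset` is found
        while i < len(t) and not itemset <= t[i]:
            i += 1
        if i == len(t):
            return False  # s does not embed into t
        i += 1
    return True

# toy care trajectory: sets of events per hospital stay (illustrative only)
trajectory = [frozenset({"hospA", "chemo"}),
              frozenset({"hospB"}),
              frozenset({"hospA", "surgery"})]
pattern = [frozenset({"chemo"}), frozenset({"surgery"})]
```

Here `pattern` is subsumed by `trajectory` ("chemo" embeds in the first stay, "surgery" in the third), which is the partial order on sequences that the pattern-structure similarity operation is built on.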
FCA and pattern structures for mining care trajectories
In this paper, we are interested in the analysis of sequential data and we propose an original framework based on Formal Concept Analysis (FCA). To this end, we introduce sequential pattern structures, an original specification of pattern structures for dealing with sequential data. Pattern structures are used in FCA for dealing with complex data such as intervals or graphs; here they are adapted to sequences. We introduce a subsumption operation for sequence comparison, based on subsequence matching. Then, a projection, i.e. a kind of data reduction of sequential pattern structures, is suggested in order to increase the efficiency of the approach. Finally, we discuss an application to a dataset of patient trajectories (the motivation of this work), which is a sequential dataset and can be processed with the introduced framework. This research work provides a new and efficient extension of FCA to deal with complex (non-binary) data, which can be an alternative for the analysis of sequential datasets.
The representation of sequential patterns and their projections within Formal Concept Analysis
Nowadays data sets are available in very complex and heterogeneous ways. The mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using an elegant mathematical framework: Formal Concept Analysis (FCA) and its extension based on "pattern structures". Pattern structures are used for mining complex data (such as sequences or graphs) and are based on a subsumption operation, which in our case is defined with respect to the partial order on sequences. We show how pattern structures, along with projections (i.e., a data reduction of sequential structures), are able to enumerate more meaningful patterns and increase the computing efficiency of the approach. Finally, we show the applicability of the presented method for discovering and analyzing interesting patient patterns from a French healthcare data set of cancer patients. The quantitative and qualitative results are reported in this use case, which is the main motivation for this work.
Mining iterative generators and representative rules for software specification discovery
DOI: 10.1109/TKDE.2010.24. IEEE Transactions on Knowledge and Data Engineering, 23(2), 282-296.
International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at IJCAI 2013, Beijing, China, August 4 2013)
This second edition of the FCA4AI workshop (the first edition was associated with the ECAI 2012 conference, see http://www.fca4ai.hse.ru/) shows again that many AI researchers are interested in FCA. Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification. FCA allows one to build a concept lattice and a system of dependencies (implications) which can be used for many AI needs, e.g. knowledge processing involving learning, knowledge discovery, knowledge representation and reasoning, ontology engineering, as well as information retrieval and text processing. Thus, there exist many natural links between FCA and AI. Accordingly, the focus of this workshop was on how FCA can support AI activities (knowledge processing) and how FCA can be extended in order to help AI researchers solve new and complex problems in their domains.
LeCo: Lightweight Compression via Learning Serial Correlations
Lightweight data compression is a key technique that allows column stores to
exhibit superior performance for analytical queries. Despite a comprehensive
study on dictionary-based encodings to approach Shannon's entropy, few prior
works have systematically exploited the serial correlation in a column for
compression. In this paper, we propose LeCo (i.e., Learned Compression), a
framework that uses machine learning to remove the serial redundancy in a value
sequence automatically to achieve an outstanding compression ratio and
decompression performance simultaneously. LeCo presents a general approach to
this end, making existing (ad-hoc) algorithms such as Frame-of-Reference (FOR),
Delta Encoding, and Run-Length Encoding (RLE) special cases under our
framework. Our microbenchmark with three synthetic and six real-world data sets
shows that a prototype of LeCo achieves a Pareto improvement on both
compression ratio and random access speed over the existing solutions. When
integrating LeCo into widely-used applications, we observe up to 3.9x speed up
in filter-scanning a Parquet file and a 16% increase in RocksDB's throughput.
- …
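The learned-compression idea behind LeCo can be sketched as follows: fit a simple model of value versus position and store only the residuals, which need far fewer bits when the column is serially correlated. With a constant model this degenerates to Frame-of-Reference, and a fitted line captures Delta-style runs of near-constant increments; the least-squares fit and toy data below are an illustrative sketch, not LeCo's actual model-selection pipeline.

```python
def fit_linear(values):
    """Least-squares line y = a*x + b over positions x = 0..n-1."""
    n = len(values)
    mx, my = (n - 1) / 2, sum(values) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(values))
    a = sxy / sxx if sxx else 0.0
    b = my - a * mx
    return a, b

def encode(values):
    """Store the model plus small integer residuals instead of raw values."""
    a, b = fit_linear(values)
    residuals = [v - round(a * i + b) for i, v in enumerate(values)]
    return (a, b), residuals

def decode(model, residuals):
    """Reconstruct the exact original values: prediction + residual."""
    a, b = model
    return [round(a * i + b) + r for i, r in enumerate(residuals)]

nearly_sorted = [100, 203, 305, 404, 509, 612]  # strong serial correlation
model, res = encode(nearly_sorted)
```

The round-trip is lossless by construction, and the residuals stay within a few units of zero, so they can be bit-packed far more tightly than the raw values; per-position decoding also preserves the random access that the abstract highlights.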