
156 research outputs found

A New Approach for Mining Order-Preserving Submatrices Based on All Common Subsequences

Author: Jie Luo
Meihang Li
Qiuhua Kuang
Tiechen Li
Xiaohui Hu
Yun Xue
Zhengling Liao
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

Order-preserving submatrices (OPSMs) are an important unsupervised learning model and have been applied in many fields, such as DNA microarray data analysis, automatic recommendation systems, and target marketing systems. Unfortunately, because mining OPSMs is an NP-complete problem, most existing methods are heuristic algorithms that cannot reveal all OPSMs. In particular, deep OPSMs, which correspond to long patterns with few supporting sequences, incur explosive computational costs and are completely pruned by most popular methods. In this paper, we propose an exact method to discover all OPSMs based on frequent sequential pattern mining. First, an existing algorithm is adjusted to find all common subsequences (ACS) between every two row sequences, so that no deep OPSM is missed. Then, an improved prefix-tree data structure is used to store and traverse the ACS, and the Apriori principle is employed to mine frequent sequential patterns efficiently. Finally, experiments on gene and synthetic datasets demonstrate the effectiveness and efficiency of the method.
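
As a rough illustration of the idea in this abstract (not the authors' algorithm), the sketch below turns each matrix row into the sequence of column indices sorted by value, enumerates all common subsequences of two such sequences by brute force, and counts how many rows support a given column order. The function names and the toy matrix are assumptions made for illustration. The enumeration is exponential in the number of columns, so it is only suitable for tiny inputs, and the prefix tree and Apriori pruning mentioned in the abstract are not reproduced here.

```python
from itertools import combinations

def row_to_sequence(row):
    """Return column indices ordered by increasing value in the row.

    Ties are broken by column index because sorted() is stable.
    """
    return tuple(sorted(range(len(row)), key=lambda j: row[j]))

def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order, not necessarily contiguously."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def all_common_subsequences(a, b, min_len=2):
    """Brute-force set of all common subsequences of a and b (exponential)."""
    common = set()
    for k in range(min_len, len(a) + 1):
        for idx in combinations(range(len(a)), k):
            candidate = tuple(a[i] for i in idx)
            if is_subsequence(candidate, b):
                common.add(candidate)
    return common

def support(pattern, sequences):
    """Number of row sequences that contain the pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)

# Hypothetical 3 x 4 expression matrix (rows = genes, columns = conditions).
matrix = [
    [1.0, 3.0, 2.0, 4.0],
    [0.5, 2.5, 1.5, 0.1],
    [9.0, 7.0, 8.0, 6.0],
]
sequences = [row_to_sequence(r) for r in matrix]
acs = all_common_subsequences(sequences[0], sequences[1])
```

Here rows 0 and 1 share the column order (0, 2, 1), so those two rows together with columns 0, 2, and 1 form an OPSM with support 2. Row 2 orders the same columns differently and does not support the pattern.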

Crossref

Directory of Open Access Journals

BicSPAM: flexible biclustering using sequential patterns

Author: A Ben-Dor
A Califano
A Patrikainen
A Prelić
A Serin
A Tanay
AA Alizadeh
AR Donders
C Creighton
C Ding
C Tang
D Bozdağ
D Martin
DS Hochbaum
F Zhu
G Atluri
G Bebek
G Getz
G Pandey
GF Berriz
H Choi
H Toivonen
H Wang
J Bellay
J Han
J Ihmels
J Liu
J Pei
J Wang
J Yang
JA Hartigan
K Sim
K Yip
L Lazzeroni
M Charrad
M de Souto
M Steinbach
MA Mahfouz
MJ Zaki
NR Mabroukeh
O Troyanskaya
P Carmona-Saez
P Fournier-Viger
Q Fang
Q Sheng
R Henriques
R Martinez
Rui Henriques
S Barkow
S Hochreiter
S Madeira
S Tavazoie
Sara C Madeira
SC Madeira
SS Young
T Calders
T Hellem
TR Golub
U Alon
X Yan
Y Huang
Y Okada
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Mining Order-Preserving Submatrices from Data with Repeated Measurements

Author: Cheung DWL
Chui CK
Kao BCM
Lee SD
Yip KY
Zhu X
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013


HKU Scholars Hub

Computational Aesthetics and Identification of Working Style

Author: Tkachuk Dmytro
Publication date: 01/01/2018

Nowadays, an enormous number of companies use Process-Aware Information Systems to manage, perform, monitor, and analyze business processes based on process models. Moreover, as part of the monitoring stage, these software systems generate event logs, which represent the actual a-posteriori workflow and are analyzed by process mining techniques. In this work, as part of process mining, we introduce the concept of working style as a tool for comprehensive analysis of the nature of work. Business processes and the interdependencies between their constituents can be evaluated from the perspective of working style, which is represented by measures and patterns. We define a novel event log representation approach, in which the log file is treated as an image. Additionally, we propose measure computation and pattern identification algorithms based on image analysis techniques in combination with computational aesthetics. As a result, a web-based prototype application for working style evaluation has been built.

DSpace at Tartu University Library

New approaches for clustering high dimensional data

Author: Liu Jinze
Publication date: 01/12/2006

Clustering is one of the most effective methods for analyzing datasets that contain a large number of objects with numerous attributes. Clustering seeks to identify groups, or clusters, of similar objects. In low dimensional space, the similarity between objects is often evaluated by summing the differences across all of their attributes. High dimensional data, however, may contain irrelevant attributes that mask the existence of clusters. The discovery of groups of objects that are highly similar within some subset of relevant attributes becomes an important but challenging task. My thesis focuses on various models and algorithms for this task. We first present a flexible clustering model, namely OP-Cluster (Order Preserving Cluster). Under this model, two objects are similar on a subset of attributes if the values of these two objects induce the same relative ordering of these attributes. The OP-Clustering algorithm has been shown to be useful for identifying co-regulated genes in gene expression data. We also propose a semi-supervised approach to discover biologically meaningful OP-Clusters by incorporating existing gene function classifications into the clustering process. This semi-supervised algorithm yields only OP-Clusters that are significantly enriched by genes from specific functional categories. Real datasets are often noisy. We propose a noise-tolerant algorithm for mining frequently occurring itemsets, called approximate frequent itemsets (AFI). Both the theoretical and experimental results demonstrate that our AFI mining algorithm has higher recoverability of real clusters than any other existing itemset mining approach. Pair-wise dissimilarities are often derived from the original data to reduce the complexity of high dimensional data. Traditional clustering algorithms that take pair-wise dissimilarities as input often generate disjoint clusters. It is well known that the classification model represented by disjoint clusters is inconsistent with many real classifications, such as gene function classifications. We develop a Poclustering algorithm, which generates overlapping clusters from pair-wise dissimilarities. We prove that by allowing overlapping clusters, Poclustering fully preserves the information of any dissimilarity matrix, while traditional partitioning algorithms may cause significant information loss.
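
The OP-Cluster similarity criterion described in this abstract can be sketched as a small check: two objects are considered similar on a subset of attributes when sorting those attributes by each object's values gives the same order. This is a minimal sketch under assumptions, not the thesis's algorithm. The function names and the example gene profiles are hypothetical, and the check uses a strict ordering with ties broken by the order in which attributes are listed.

```python
def induced_order(values, attributes):
    """Order the chosen attributes by the object's values on them."""
    return tuple(sorted(attributes, key=lambda a: values[a]))

def same_relative_order(obj_a, obj_b, attributes):
    """True if both objects rank the given attributes identically."""
    return induced_order(obj_a, attributes) == induced_order(obj_b, attributes)

# Hypothetical expression profiles over four conditions.
gene_x = {"t1": 0.2, "t2": 1.4, "t3": 0.9, "t4": 3.0}
gene_y = {"t1": 10.0, "t2": 55.0, "t3": 30.0, "t4": 5.0}

similar_on_three = same_relative_order(gene_x, gene_y, ["t1", "t2", "t3"])
similar_on_four = same_relative_order(gene_x, gene_y, ["t1", "t2", "t3", "t4"])
```

The two profiles differ greatly in magnitude but agree on the order t1 < t3 < t2, so they are similar on that subset. Adding t4 breaks the agreement, which illustrates why the model searches over subsets of attributes.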

Carolina Digital Repository

Biclustering Based on FCA and Partition Pattern Structures for Recommendation Systems

Author: Codocedo Victor
Couceiro Miguel
Juniarta Nyoman
Napoli Amedeo
Publication venue: HAL CCSD
Publication date: 13/07/2018

This paper focuses on item recommendation for museum visitors within the framework of the European project CrossCult on cultural heritage. We present theoretical research on recommendation using biclustering. Our approach is based on biclustering using FCA and partition pattern structures. First, we recall a previous recommendation method based on constant-column biclusters. Then, we propose an alternative approach that incorporates order information and uses coherent-evolution-on-columns biclusters. This alternative approach shares some common features with sequential pattern mining. Finally, given a dataset of visitor trajectories, we indicate how these approaches can be used to build a collaborative recommendation strategy.
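
For readers unfamiliar with the term, a constant-column bicluster is a set of rows and columns in which every selected column takes a single value across all selected rows. The sketch below checks that property directly. It does not use FCA or partition pattern structures, and the function name and the small visitor-by-item rating matrix are assumptions made for illustration.

```python
def is_constant_column_bicluster(matrix, rows, cols):
    """True if each selected column has one value across the selected rows."""
    return all(len({matrix[r][c] for r in rows}) == 1 for c in cols)

# Hypothetical visitor-by-item ratings (rows = visitors, columns = items).
ratings = [
    [5, 3, 1, 4],
    [5, 2, 1, 4],
    [5, 3, 1, 2],
]

all_visitors_items_0_2 = is_constant_column_bicluster(ratings, [0, 1, 2], [0, 2])
all_visitors_items_0_1 = is_constant_column_bicluster(ratings, [0, 1, 2], [0, 1])
visitors_0_2_items_0_1_2 = is_constant_column_bicluster(ratings, [0, 2], [0, 1, 2])
```

In this toy matrix, all three visitors agree on items 0 and 2, while only visitors 0 and 2 also agree on item 1.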

INRIA a CCSD electronic archive server

BicNET: Flexible module discovery in large-scale biological networks using biclustering

Author: A Ben-Dor
A Javed
A Kirsch
A Mukhopadhyay
A Prelic
A Serin
A Tanay
AL Barabasi
CC Aggarwal
D Bozdag
D Martin
D Szklarczyk
DJ Reiss
E Eden
E Georgii
E Segal
F Bonchi
G Ramesh
GA Pavlopoulos
H-Y Chuang
I Farkas
J Bellay
J Berg
J Chen
J Han
J Ihmels
J Liu
J Pei
JB Pereira-Leal
JI MacPherson
JLY Koh
JM Cherry
M Teixeira
MJ Zaki
MT Dittrich
O Odibat
P Dao
R Agrawal
R Colak
R Das
R Henriques
R Martinez
R Sharan
Rui Henriques
S Barkow
S Hochreiter
S Mitra
SA Chowdhury
Sara C. Madeira
SC Madeira
T Ideker
TM Murali
U Maulik
V Bo
V Spirin
W DuMouchel
Y Cheng
Y Okada
Publication venue: 'Springer Science and Business Media LLC'

Crossref