12 research outputs found

Finding Frequent Patterns in a Large Sparse Graph

Author: A. Inokuchi
B.D. McKay
D.J. Cook
D.S. Hochbaum
E.M. Mitchell
George Karypis
H.M. Berman
H.M. Grindley
I. Jonyer
I. Koch
J.M. Kleinberg
J.M. Robson
J.W. Raymond
K. Yoshida
M. Kuramochi
M.M. Halldórsson
M.R. Garey
Michihiro Kuramochi
N. Leibowitz
P.R.J. Östergård
R.C. Read
S.H. Muggleton
W. Lee
X. Pennec
X. Wang
Publication venue: 'Springer Science and Business Media LLC'

A latitudinal study on the use of sequential and concurrency patterns in deviance mining

Author: C Diamantini
C Jiang
G Chandrashekar
G Greco
I Jonyer
Jiawei Han
M Kuramochi
SY Hwang
W Aalst van der
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2020

Deviance mining is an emerging area in the field of Process Mining, with the aim of explaining the differences between normal and deviant process executions. Deviance mining approaches typically extract representative subprocesses characterizing normal/deviant behaviors from an event log and use these subprocesses as features for classification. Existing approaches mainly differ in the feature extraction technique employed and, in particular, in the representation of the patterns extracted, ranging from patterns consisting of sequences of activities to patterns explicitly representing concurrency. In this work, we perform a latitudinal study on the use of sequential and concurrency patterns in deviance mining. The comparison between sequential and concurrency patterns is performed through experiments on two real-world event logs, by varying both classification and feature extraction algorithms. Our results show that the pattern representation has limited impact on classification performance, while the use of concurrency patterns provides more meaningful insights into deviant behavior.

Repository TU/e

Crossref

Pure OAI Repository

IRIS Università Politecnica delle Marche

A Latitudinal Study on the Use of Sequential and Concurrency Patterns in Deviance Mining

Author: C Diamantini
C Jiang
G Chandrashekar
G Greco
I Jonyer
Jiawei Han
M Kuramochi
SY Hwang
W Aalst van der

Deviance mining is an emerging area in the field of Process Mining, with the aim of explaining the differences between normal and deviant process executions. Deviance mining approaches typically extract representative subprocesses characterizing normal/deviant behaviors from an event log and use these subprocesses as features for classification. Existing approaches mainly differ in the feature extraction technique employed and, in particular, in the representation of the patterns extracted, ranging from patterns consisting of sequences of activities to patterns explicitly representing concurrency. In this work, we perform a latitudinal study on the use of sequential and concurrency patterns in deviance mining. The comparison between sequential and concurrency patterns is performed through experiments on two real-world event logs, by varying both classification and feature extraction algorithms. Our results show that the pattern representation has limited impact on classification performance, while the use of concurrency patterns provides more meaningful insights into deviant behavior.

Crossref

IRIS Università Politecnica delle Marche

A stochastic context free grammar based framework for analysis of protein sequences

Author: A Golovin
A Krogh
AC Wallace
B Feng
B Keller
B Knudsen
B Robson
CJA Sigrist
CW Cleverdon
D Wadowski
DB Searls
DE Goldberg
DT Jones
EM Gold
GD Forney
GE Revesz
H Mamitsuka
HM Berman
I Jonyer
J Arabas
J Davis
J Hopcroft
J Kupiec
J Maczka
Jean-Christophe Nebel
JH Holland
JK Baker
JL Fauchere
JR Koza
K Nakai
K Tomii
KS Pollard
LE Baum
M Mernik
M Wall
MA Jimenez-Montaño
MI Kanehisa
N Abe
N Chomsky
N Hulo
NJ Mulder
P Klein
PP Vaidyanathan
PR Dupont
PY Chou
R Durbin
S Eddy
S Geman
S Kawashima
S Lonardi
T Head
T Ishikawa
TK Attwood
UniProt Consortium
V Biou
V Brendel
W Dyrka
Witold Dyrka
Y Sakakibara
Y Sakakibara
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Background: In the last decade, there have been many applications of formal language theory in bioinformatics, such as RNA structure prediction and detection of patterns in DNA. However, in the field of proteomics, the size of the protein alphabet and the complexity of the relationships between amino acids have mainly limited the application of formal language theory to the production of grammars whose expressive power is not higher than that of stochastic regular grammars. These grammars, like other state-of-the-art methods, cannot cover any higher-order dependencies such as nested and crossing relationships that are common in proteins. In order to overcome some of these limitations, we propose a Stochastic Context Free Grammar based framework for the analysis of protein sequences where grammars are induced using a genetic algorithm.
Results: This framework was implemented in a system aiming at the production of binding site descriptors. These descriptors not only allow detection of protein regions that are involved in these sites, but also provide insight into their structure. Grammars were induced using quantitative properties of amino acids to deal with the size of the protein alphabet. Moreover, we imposed some structural constraints on grammars to reduce the extent of the rule search space. Finally, grammars based on different properties were combined to convey as much information as possible. Evaluation was performed on sites of various sizes and complexity described either by PROSITE patterns, domain profiles or a set of patterns. Results show the produced binding site descriptors are human-readable and, hence, highlight biologically meaningful features. Moreover, they achieve good accuracy in both annotation and detection. In addition, findings suggest that, unlike current state-of-the-art methods, our system may be particularly suited to deal with patterns shared by non-homologous proteins.
Conclusion: A new Stochastic Context Free Grammar based framework has been introduced allowing the production of binding site descriptors for analysis of protein sequences. Experiments have shown that not only is this new approach valid, but it also produces human-readable descriptors for binding sites which have been beyond the capability of current machine learning techniques.

Crossref

Springer

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Kingston University Research Repository

Hierarchical conceptual clustering based on quantile method for identifying microscopic details in distributional data

Author: AK Jain
DH Fisher
E Diday
FdA de Carvalho
I Jonyer
L Billard
L Hubert
L Vendramin
M Ichino
S Goswami
SC Johnson
Y El-Sonbaty
Publication venue: 'Springer Science and Business Media LLC'

Crossref

State of the art of graph-based data mining

Author: Agrawal R.
De Raedt L.
Debnath A.
Geibel P.
Hiroshi Motoda
Jonyer I.
Kashima H.
Liquiere M.
Mannila H.
Nijssen S.
Srinivasan A.
Takashi Washio
Vapnik V.
Yan X.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Scalable semantic analytics on social networks for addressing the problem of conflict of interest detection

Author: Amit Sheth
Anderson R.
Anupam Joshi
Boanerges Aleman-Meza
Crescenzi V.
Ding L.
Garton L.
Hollywood J.
Horrocks I.
I. Budak Arpinar
Jonyer I.
Kalashnikov D.
Kautz H.
Li Ding
Meenakshi Nagarajan
Miller E.
Neville J.
Sheth A. P.
Tim Finin
Xu J.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Keyword-Based Search of Workflow Fragments and Their Composition

Author: A Awad
A Goderis
A Koschmider
B Giardine
C Diamantini
D Leake
D Roure De
H Leopold
I Jonyer
J Bae
J Starlinger
M Harmassi
ML Rosa
N Peters
P Mates
R Bergmann
The Huntington’s Disease Collaborative Research Group
U Cayoglu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

LNCS, volume 10190; TCCI, volume 10190

Workflow specification, in science as in business, can be a difficult task, since it requires a deep knowledge of the domain to be able to model the chaining of the steps that compose the process of interest, as well as awareness of the computational tools, e.g., services, that can be utilized to enact such steps. To assist designers in this task, we investigate in this paper a methodology that consists in exploiting existing workflow specifications that are stored and shared in repositories, to identify workflow fragments that can be re-utilized and re-purposed by designers when specifying new workflows. Specifically, we present a method for identifying fragments that are frequently used across workflows in existing repositories, and therefore are likely to incarnate patterns that can be reused in new workflows. We present a keyword-based search method for identifying the fragments that are relevant for the needs of a given workflow designer. We go on to present an algorithm for composing the retrieved fragments with the initial (incomplete) workflow that the user designed, based on compatibility rules that we identified, and showcase how the algorithm operates using an example from eScience.

Crossref

Assessing the quality of multilevel graph clustering

Author: A Vespignani
B Good
D Chakrabarti
François Queyroi
G Pflieger
Guy Melançon
I Jonyer
J Rouwendal
Jean-Marc Fédou
M Girvan
M Mishna
Maylis Delest
MEJ Newman
MP Delest
P Erdös
P Pons
R Patuelli
S Fortunato
SE Schaeffer
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Mining local process models and their correlations

Author: C Diamantini
D Chapela-Campa
E Ramezani
G Greco
H Mannila
I Jonyer
J Carmona
JMEM Werf van der
K Järvelin
KY Huang
L Genga
L Măruşter
M Leemans
N Tax
P Fournier-Viger
Philippe Fournier-Viger
RP Jagadeesh Chandra Bose
S Schönig
SJJ Leemans
W Reisig
W Zhou
WMP Aalst van der
X Lu
Z Huang
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2019

Mining local patterns of process behavior is a vital tool for the analysis of event data that originates from flexible processes, which in general cannot be described by a single process model without overgeneralizing the allowed behavior. Several techniques for mining local patterns have been developed over the years, including Local Process Model (LPM) mining, episode mining, and the mining of frequent subtraces. These pattern mining techniques can be considered to be orthogonal, i.e., they provide different types of insights on the behavior observed in an event log. In this work, we demonstrate that the joint application of LPM mining and other pattern mining techniques provides benefits over applying only one of them. First, we show how the output of a subtrace mining approach can be used to mine LPMs more efficiently. Secondly, we show how instances of LPMs can be correlated together to obtain larger LPMs, thus providing a more comprehensive overview of the overall process. We demonstrate both effects on a collection of real-life event logs.

Repository TU/e

Crossref

Pure OAI Repository