26 research outputs found

Partitioning clustering algorithms for protein sequence data sets

Author: A Enright
A Herger
A Krause
DW Mount
E Bolten
E Kriventseva
F Can
G Yona
H Cathy
H Spath
J Hartigan
J Shi
KJ Anil
L Kaufman
Mohamed Limam
N Essoussi
Nadia Essoussi
O Sasson
P Cabena
P Clote
P Pipenbacher
P Sperisen
R Ng
R Tatusov
RC Dubes
S Altschul
S Henikoff
S Schneckener
S Van Dongen
SB Needleman
SE Brenner
Sondes Fayech
TF Smith
UM Fayyad
V Faber
V Guralnik
WR Pearson
Z Wu
Publication venue: BioMed Central
Publication date: 01/04/2009

Abstract. Background: Genome-sequencing projects are producing an enormous number of new sequences, causing protein sequence databases to grow rapidly. The unsupervised classification of these data into functional groups or families (clustering) has become one of the principal research objectives in structural and functional genomics, and computer programs that classify sequences into families automatically and accurately have become a necessity. Many methods have addressed the clustering of protein sequences, and most fall into three major groups: hierarchical, graph-based and partitioning methods. Of these, hierarchical and graph-based approaches have been widely used. Although partitioning clustering techniques are widely used in other fields, few have been applied to protein sequence clustering. It has not been fully demonstrated whether partitioning methods can be applied to protein sequence data, or whether they are efficient compared with published clustering methods. Methods: We developed four partitioning clustering approaches that use the Smith-Waterman local-alignment algorithm to determine pairwise similarities between sequences. Four different sets of protein sequences were used as evaluation data sets for the proposed methods. Results: We show that these methods outperform several other published clustering methods in terms of correctly predicting a classifier and especially in terms of the correctness of the provided prediction. The software is available to academic users from the authors upon request.
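
The general idea in the Methods statement, partitioning sequences from pairwise Smith-Waterman similarities, can be illustrated with a minimal sketch. This is not the authors' software and not one of their four approaches: the k-medoids-style loop, the function names, and the simple match/mismatch/gap scores (used here in place of a substitution matrix and affine gap penalties) are all assumptions chosen for illustration.

```python
from itertools import combinations


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score of a and b (linear gap penalty, toy scores)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity_matrix(seqs):
    """Symmetric matrix of pairwise local-alignment scores."""
    n = len(seqs)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = smith_waterman_score(seqs[i], seqs[i])
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = smith_waterman_score(seqs[i], seqs[j])
    return sim


def assign(sim, medoids):
    """Label each sequence with the index of its most similar medoid."""
    return [
        max(range(len(medoids)), key=lambda c: sim[i][medoids[c]])
        for i in range(len(sim))
    ]


def k_medoids(sim, initial_medoids, max_iter=20):
    """Hypothetical k-medoids-style partitioning on a similarity matrix."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(sim, medoids)
        updated = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                updated.append(old)  # keep the old medoid for an empty cluster
                continue
            # New medoid: the member most similar, in total, to its cluster.
            updated.append(
                max(members, key=lambda m: sum(sim[m][o] for o in members))
            )
        if updated == medoids:
            break
        medoids = updated
    return assign(sim, medoids), medoids
```

Because the medoid is always one of the input sequences, the loop needs only the similarity matrix and never has to average sequences, which is what makes a partitioning scheme of this kind usable on alignment scores.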

Crossref

Directory of Open Access Journals

PubMed Central

Statistical strategies for avoiding false discoveries in metabolomics and related experiments

Author: A. Bradford Hill
A. Cornish-Bowden
A. Demiriz
A. Goffeau
A. Golbraikh
A. Hutchinson
A. Linden
A. Reiner
A. Saltelli
A.C. Leon
A.C. Tas
A.H. Fielding
A.J. Miller
A.W.F. Edwards
B. Efron
B. Fortner
B. Shipley
B.F.J. Manly
B.K. Alsberg
B.R. Kirkwood
B.S. Everitt
C. Chatfield
C. Mering von
C. Rijsbergen van
C. Stephan
C.A. Coello
C.A. Goble
C.B. Lucasius
C.E. Metz
C.J. Needham
C.R. Hicks
D. Broadhurst
D. Camacho
D. di Bernardo
D. Edwards
D. Hand
D.A. Berry
D.A. Fell
D.A. Veldhuizen Van
D.B. Kell
D.C. Montgomery
D.F. Ransohoff
D.G. Altman
D.J.C. Mackay
D.S. Grimes
David I. Broadhurst
Douglas B. Kell
E. Jellum
E. Urbanczyk-Wochniak
E. Zitzler
E.C. Horning
E.E. Ntzani
E.F. Petricoin III
E.P. Diamandis
E.R. Gansner
E.R. Tufte
F. Kose
F.V. Jensen
G. Casella
G.A.F. Seber
G.E.P. Box
G.G. Harrigan
G.S. Catchpole
H. Brenner
H. Martens
H. White
H.-X. Li
H.C. Frey
H.L. Kirschenlohr
H.V. Westerhoff
H.W. Ressom
I.T. Jolliffe
J. Cornfield
J. Handl
J. Pearl
J. Sacks
J. Zupan
J.A. Hanley
J.A. Todd
J.D. Barrow
J.D. Storey
J.E. Oakley
J.H. Zhang
J.J. Rowland
J.L. Ringuest
J.M. Bernardo
J.M. Bland
J.P. Egan
J.P. Ioannidis
J.R. Koza
J.W. Sammon Jr.
J.W. Tukey
K. Bennett
K. Deb
K.A. Baggerly
K.J. Rothman
L. Breiman
L. Ein-Dor
L. Eriksson
L. Hubert
L. Wilkinson
L.A. Zadeh
L.G. Valiant
L.J. ‘t Veer van
L.M. Raamsdonk
M. Anthony
M. Bland
M. Brown
M. Cascante
M. Chen
M. Friendly
M. Hollander
M. Peleg
M. Ramoni
M. Woodward
M.B. Seasholtz
M.H. Zweig
M.J. Gardner
M.J. Vijver van de
M.J.A. Berry
M.S. Sehgal
N. Rifai
N.A. Obuchowski
O. Troyanskaya
O.P. Rud
P. Adriaans
P. Baldi
P. Cabena
P. Dasgupta
P. Duesberg
P. Eades
P. Langley
P. Romano
P.E. Rapp
P.R. Williamson
R. Bellman
R. Brent
R. Goodacre
R. Heinrich
R. Judson
R. Kruse
R. Royall
R. Steuer
R. Stevens
R.E. Shaffer
R.F. Raubertas
R.G. Brereton
R.H. Myers
R.J. Cook
R.M. Jarvis
R.O. Duda
R.R. Sokal
S. Natarajan
S. O’Hagan
S. Wacholder
S. Wold
S.B. Crary
S.C. Potter
S.G. Baker
S.G. Oliver
S.H. Jung
S.H. Weiss
S.J. Sharp
S.K. Kim
S.M. Weiss
S.N. Deming
S.N. Goodman
T. Hastie
T. Kamada
T. Kohonen
T. Oinn
T.A. White
T.M. Mitchell
T.M.D. Ebbels
T.M.J. Fruchterman
T.R. Golub
T.V. Perneger
U. Horchner
V.C.P. Chen
V.J. Gillet
V.N. Vapnik
W. Greenaway
W. Weckwerth
W.B. Kannel
W.B. Langdon
W.E. Evans
W.J. Conover
W.J. Krzanowski
W.S. Cleveland
X. Cui
X. Zhou
X.H. Zhou
Y. Benjamini
Y. Liang
Y. Tu
Y. Wang
Y. Xie
Z. Michalewicz
Publication venue: Springer Science and Business Media LLC

Crossref

Sequence Rules for Web Clickstream Analysis

Author: J. Srivastava
P. Cabena
P. Giudici
S. Lauritzen
Publication venue: Springer Science and Business Media LLC

Crossref

Sequence rules for web clickstream analysis

Author: J. Srivastava
P. Cabena
P. Giudici
S. Lauritzen
Publication venue: Springer Verlag

Crossref

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

Overview of Knowledge Discovery and Data Mining Process Models

Author: Berry M.
Cabena P.
Fayyad U. M.
Maynika J.
Publication venue: Informa UK Limited

Crossref

A Review of: “Discovering Knowledge in Data: An Introduction to Data Mining”

Author: Borok L. S.
Cabena P.
Fayyad U. M.
Han J.
S. Omid Fatemieh
Szolvits P.
Publication venue: Informa UK Limited

Crossref

RESEARCH ON DATA MINING AND KNOWLEDGE MANAGEMENT AND ITS APPLICATIONS IN CHINA'S ECONOMIC DEVELOPMENT: SIGNIFICANCE AND TREND

Author: Alavi M.
Awad E. M.
Cabena P.
RUWEI DAI
SIWEI CHENG
WEIXUAN XU
YONG SHI
Publication venue: World Scientific Pub Co Pte Lt

Crossref

A KDD Experience Factory: Using Textual CBR for Reusing Lessons Learned

Author: A. Aamodt
D. J. Hand
K. A. Bartlmae
M. Lenz
P. Cabena
R. Bergmann
U. Fayyad
Publication venue: Springer Science and Business Media LLC

Crossref

CRISP-eSNeP: Towards a data-driven knowledge discovery process for electronic social networks

Author: Blau P.M.
Cabena P.
Daniel Adomako Asamoah
Li Y.
Ola O.
Ramesh Sharda
Shearer C.
Tufekci Z.
Watson H.J.
Publication venue: Informa UK Limited

Crossref

Mining reference process models and their configurations

Author: A. Rozinat
B.F. Dongen van
D. Pyle
F. Gottschalk
G. Keller
J. Becker
M. Rosemann
O. Thomas
P. Cabena
P. Fettke
S. Zhang
W.M.P. Aalst van der
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2008

Reference process models are templates for common processes run by many corporations. However, the individual needs of organizations in executing these processes usually vary. A process model can address these variations through control-flow choices and can thus integrate the different process variants into one model. Through configuration parameters, a configurable reference model enables corporations to derive their individual process variant from such an integrated model. While this simplifies the adaptation process for the reference model user, constructing a configurable model that integrates several process variants is far more complex than creating a traditional reference model that depicts a single best-practice variant. In this paper we therefore recommend applying process mining techniques to the log files of existing, well-running IT systems to help the reference model provider create such integrated process models. Afterwards, the same log files are used to derive suggestions for common configurations that can serve as starting points for individual configurations.

Repository TU/e

Crossref

Pure OAI Repository