Search CORE

1,134 research outputs found

Multiobjective optimization in bioinformatics and computational biology

Author: Handl Julia
Kell Douglas B.
Knowles Joshua
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2007
Field of study

The University of Manchester - Institutional Repository

Fuzzy clustering of univariate and multivariate time series by genetic multiobjective optimization

Author: R. Baragona
S. Bandyopadhyay
U. Maulik
Publication venue
Publication date
Field of study

Given a set of time series, it is of interest to discover subsets that share similar properties. For instance, this may be useful for identifying and estimating a single model that may fit conveniently several time series, instead of performing the usual identification and estimation steps for each one. On the other hand time series in the same cluster are related with respect to the measures assumed for cluster analysis and are suitable for building multivariate time series models. Though many approaches to clustering time series exist, in this view the most effective method seems to have to rely on choosing some features relevant for the problem at hand and seeking for clusters according to their measurements, for instance the autoregressive coe±cients, spectral measures or the eigenvectors of the covariance matrix. Some new indexes based on goodnessof-fit criteria will be proposed in this paper for fuzzy clustering of multivariate time series. A general purpose fuzzy clustering algorithm may be used to estimate the proper cluster structure according to some internal criteria of cluster validity. Such indexes are known to measure actually definite often conflicting cluster properties, compactness or connectedness, for instance, or distribution, orientation, size and shape. It is argued that the multiobjective optimization supported by genetic algorithms is a most effective choice in such a di±cult context. In this paper we use the Xie-Beni index and the C-means functional as objective functions to evaluate the cluster validity in a multiobjective optimization framework. The concept of Pareto optimality in multiobjective genetic algorithms is used to evolve a set of potential solutions towards a set of optimal non-dominated solutions. Genetic algorithms are well suited for implementing di±cult optimization problems where objective functions do not usually have good mathematical properties such as continuity, differentiability or convexity. In addition the genetic algorithms, as population based methods, may yield a complete Pareto front at each step of the iterative evolutionary procedure. The method is illustrated by means of a set of real data and an artificial multivariate time series data set.Fuzzy clustering, Internal criteria of cluster validity, Genetic algorithms, Multiobjective optimization, Time series, Pareto optimality

Research Papers in Economics

On combinatorial optimisation in analysis of protein-protein interaction and protein folding networks

Author: Chalupa David
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/03/2016
Field of study

Abstract: Protein-protein interaction networks and protein folding networks represent prominent research topics at the intersection of bioinformatics and network science. In this paper, we present a study of these networks from combinatorial optimisation point of view. Using a combination of classical heuristics and stochastic optimisation techniques, we were able to identify several interesting combinatorial properties of biological networks of the COSIN project. We obtained optimal or near-optimal solutions to maximum clique and chromatic number problems for these networks. We also explore patterns of both non-overlapping and overlapping cliques in these networks. Optimal or near-optimal solutions to partitioning of these networks into non-overlapping cliques and to maximum independent set problem were discovered. Maximal cliques are explored by enumerative techniques. Domination in these networks is briefly studied, too. Applications and extensions of our findings are discussed

Repository@Hull - Worktribe

Clustering Algorithms: Their Application to Gene Expression Data

Author: Agrawal R.
Alizadeh A.A.
Bandyopadhyay S.
Bandyopadhyay S.
Bezdek J.C.
Bezdek J.C.
Bezdek† J.C.
Bhargavi M.S.
Blatt M.
Bochkov Y.A.
Brunet J.P.
Bryan K.
Buitinck L.
Bunnik E.M.
Caliński T.
Chandrasekhar T.
Cheng Y.
Costa I.G.
Cover T.M.
D'haeseleer P.
Dave R.N.
Davies D.L.
De Morsier F.
Dempster A.P.
Dharmarajan A.
Dhillon I.S.
Divina F.
Do C.B.
Domany E.
Du Z.
Dunn† J.C.
Edla D.R.
Eisen M.B.
Ferguson T.S.
Frey B.J.
Fu L.
Fukuyama Y.
Galluccio L.
Gath I.
Getz G.
Gordon G.J.
Gu J.
Guha S.
Handhayani T.
Handl J.
Hatamlou A.
Heard N.A.
Heyer L.J.
Hinneburg A.
Hinneburg A.
Hu X.
Hubert L.J.
Jain A.K.
Jiang D.
Jiang H.
Joopudi S.
Kao Y.T.
Karmilasari S.W.
Karypis G.
Kaufman L.
Kerr G.
Kluger Y.
Kohonen T.
Kohonen T.
Krzanowski W.J.
Leone M.
Lu Y.
Lu Y.
Ma'sum M.A.
MacQueen J.
Madeira S.C.
Mann A.K.
Masciari E.
Maulik U.
Milligan G.W.
Mitra S.
Moon T.K.
Moore W.C.
Müllner D.
Nagpal A.
Nasser S.
Neal R.M.
Ng R.T.
Pakhira M.K.
Pal N.R.
Pedregosa F.
Pirim H.
Pitman J.
Prelić A.
Qin Z.S.
Raman S.
Rasmussen C.E.
Rezaee B.
Rezaee M.R.
Ruspini E.H.
Saha S.
Saha S.
Saha S.
Sathishkumar K.
Sheikholeslami G.
Sheng Q.
Sirinukunwattana K.
Sokal R.R.
Sun J.
Talaat A.M.
Tamayo P.
Tanay A.
Tang C.
Thalamuthu A.
Tibshirani R.
Wan M.
Wang L.
Wang W.
Williams G.
Wu J.
Wu K.L.
Wu S.
Xie X.L.
Xu R.
Xu Y.
Yu H.
Zhang D.
Zhang T.
Zhang Y.
Zhang Z.Y.
Zhao L.
Zhong C.
Zitnik M.
Řehůřek R.
Publication venue: 'SAGE Publications'
Publication date: 01/01/2016
Field of study

Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data proffers a prodigious preference to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpretation of the resulting mass of data, which consists of millions of measurements; these data also inhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and iden-tify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to the gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and high degree of accuracy in its analysis procedure

Covenant University Repository

Crossref

Directory of Open Access Journals

PubMed Central

A Multiobjective Evolutionary Conceptual Clustering Methodology for Gene Annotation Within Structural Databases: A Case of Study on the Gene Ontology Database

Author: Cordón Óscar
Herrera Francisco
Perren Cobb J.
Romero Zaliz Rocío C.
Rubio Escudero Cristina
Zwir Igor
Publication venue: IEEE Computer Society
Publication date: 01/01/2008
Field of study

Current tools and techniques devoted to examine the content of large databases are often hampered by their inability to support searches based on criteria that are meaningful to their users. These shortcomings are particularly evident in data banks storing representations of structural data such as biological networks. Conceptual clustering techniques have demonstrated to be appropriate for uncovering relationships between features that characterize objects in structural data. However, typical con ceptual clustering approaches normally recover the most obvious relations, but fail to discover the lessfrequent but more informative underlying data associations. The combination of evolutionary algorithms with multiobjective and multimodal optimization techniques constitutes a suitable tool for solving this problem. We propose a novel conceptual clustering methodology termed evolutionary multiobjective conceptual clustering (EMO-CC), re lying on the NSGA-II multiobjective (MO) genetic algorithm. We apply this methodology to identify conceptual models in struc tural databases generated from gene ontologies. These models can explain and predict phenotypes in the immunoinflammatory response problem, similar to those provided by gene expression or other genetic markers. The analysis of these results reveals that our approach uncovers cohesive clusters, even those comprising a small number of observations explained by several features, which allows describing objects and their interactions from different perspectives and at different levels of detail.Ministerio de Ciencia y Tecnología TIC-2003-00877Ministerio de Ciencia y Tecnología BIO2004-0270EMinisterio de Ciencia y Tecnología TIN2006-1287

idUS. Depósito de Investigación Universidad de Sevilla

Fuzzy clustering of univariate and multivariate time series by genetic multiobjective optimization

Author: BARAGONA Roberto
S. BANDYOPADHYAY
U. MAULIK
Publication venue: country:DEU
Publication date: 01/01/2010
Field of study

COMISEF Working Papers Series WPS-028 08/02/2010 URL: http://comisef.eu/files/wps028.pd

Archivio della ricerca- Università di Roma La Sapienza

Extensive Analysis on Generation and Consensus Mechanisms of Clustering Ensemble: A Survey

Author: Sandhya Rani Yalamarthi Leela
Satyanarayana K. V. V.
Sucharita V.
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/08/2018
Field of study

Data analysis plays a prominent role in interpreting various phenomena. Data mining is the process to hypothesize useful knowledge from the extensive data. Based upon the classical statistical prototypes the data can be exploited beyond the storage and management of the data. Cluster analysis a primary investigation with little or no prior knowledge, consists of research and development across a wide variety of communities. Cluster ensembles are melange of individual solutions obtained from different clusterings to produce final quality clustering which is required in wider applications. The method arises in the perspective of increasing robustness, scalability and accuracy. This paper gives a brief overview of the generation methods and consensus functions included in cluster ensemble. The survey is to analyze the various techniques and cluster ensemble methods

ZENODO

Institute of Advanced Engineering and Science