234 research outputs found

Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs

Author: A Banerjee
A Clauset
A Misbahuddin
A Ruepp
AK Jain
AL Barabási
AN Langville
AP Erdös
B Long
CJ Sylvester
D Lee
D Lee
D Zhou
E Hüllermeier
E Ravasz
Fabian J Theis
Florian Blöchl
G Karypis
G Palla
H Cho
I Dhillon
J Bezdek
J Dunn
JB MacQueen
JB Pereira-Leal
K Devarajan
KI Goh
KV Mardia
M Barber
M Campos
M Fiorio
MA Yildirim
Mara L Hartsperger
N Gulbahce
P Paatero
P Wong
R Montanez
RC Samaco
RJ Shprintzen
RR Lebel
S Bauer
S Klamt
S Maslov
T Barnickel
Volker Stümpflen
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. Results: Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. Conclusions: In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem.
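
The abstract above says the method mimics the multiplicative update rules used in non-negative matrix factorization (NMF). As a rough illustration of that family of updates only, and not the paper's k-partite algorithm, here is a minimal sketch of the standard Lee-Seung updates for a single bipartite adjacency matrix. The function names, the toy matrix, and the row normalization used to read off fuzzy memberships are all assumptions made for this example.

```python
import numpy as np

def nmf_multiplicative(A, n_clusters, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix A (n x m) as W @ H.

    Uses the standard multiplicative updates for the squared
    Frobenius error. This is generic NMF, not the k-partite
    method described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    # Strictly positive random start so no entry is stuck at zero.
    W = rng.random((n, n_clusters)) + eps
    H = rng.random((n_clusters, m)) + eps
    for _ in range(n_iter):
        # Each factor is rescaled element-wise, so it stays non-negative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def fuzzy_memberships(W, eps=1e-9):
    """Row-normalize W so each node's cluster weights sum to about 1."""
    return W / (W.sum(axis=1, keepdims=True) + eps)

# Hypothetical toy bipartite graph: rows are nodes of one type,
# columns are nodes of another type, with two obvious blocks.
A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)

W, H = nmf_multiplicative(A, n_clusters=2)
M = fuzzy_memberships(W)  # one row of membership degrees per row node
```

Each row of `M` can be read as degrees of membership of a row node in each cluster, which is the kind of overlapping assignment the abstract describes; how the published method couples several node types and estimates the cluster-level graph is not shown here.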

Springer - Publisher Connector

Directory of Open Access Journals

Multi-objective community detection applied to social and COVID-19 constructed networks

Author: Ahmed Jenan Moosa
Publication venue: Brunel University London
Publication date: 01/01/2022

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. Community detection plays an integral part in network analysis, as it facilitates understanding the structure and functional characteristics of a network. Communities organize real-world networks into densely connected groups of nodes. This thesis provides a critical analysis of community detection and highlights the main areas, including algorithms, evaluation metrics, applications, and datasets in social networks. After defining the research gap, this thesis proposes two Attribute-Based Label Propagation algorithms that maximize both Modularity and homogeneity. Homogeneity is treated as an objective function in one algorithm and as a constraint in the other. To better capture the homogeneity of real-world networks, a new Penalized Homogeneity degree (PHd) is proposed that can be easily personalized based on the network characteristics. For the first time, COVID-19 tracing data are used to form two network datasets: the first is based on the virus transition between the world's countries, while the second is an attributed network based on the virus transition within contact-tracing data from the Kingdom of Bahrain. Networks of this type, which track a disease, had not previously been formed from COVID-19 data and have never been studied as a community detection problem. The proposed datasets are validated and tested in several experiments. The proposed Penalized Homogeneity measure is personalized and used to evaluate the proposed attributed network. Extensive experiments and analysis are carried out to evaluate the proposed methods and benchmark the results against other well-known algorithms. The results are compared in terms of Modularity, the proposed PHd, and accuracy measures. The proposed methods achieved the best performance among the compared methods, with 26.6% better Modularity and 33.96% better PHd on the proposed dataset, as well as noteworthy results on benchmark datasets, with Modularity improvements of 7.24% and 4.96%, respectively, and PHd values of 27% and 81.9%.
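
The abstract above evaluates partitions by Modularity. As a point of reference, here is a minimal sketch of the standard Newman-Girvan modularity for an undirected graph, assuming that this is the measure meant; the thesis's label propagation algorithms and its PHd measure are not defined in the abstract and are not reproduced here. The function name and the toy graph are hypothetical.

```python
import numpy as np

def modularity(A, labels):
    """Newman-Girvan modularity of a partition of an undirected graph.

    A is a symmetric adjacency matrix, labels[i] is the community of
    node i. Q = (1 / 2m) * sum_ij (A_ij - k_i * k_j / 2m) * [c_i == c_j]
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)            # node degrees
    two_m = k.sum()              # twice the number of edges
    same = labels[:, None] == labels[None, :]
    B = A - np.outer(k, k) / two_m
    return float((B * same).sum() / two_m)

# Hypothetical toy graph: two triangles joined by a single edge (2-3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one community
```

On this toy graph the two-triangle split scores 5/14 (about 0.357), while placing every node in one community scores 0, which is why higher Modularity is read as a better-separated partition.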

Brunel University Research Archive

Unsupervised Algorithms for Microarray Sample Stratification

Author: Cattelani Luca
Federico Antonio
Fratello Michele
Greco Dario
Pavel Alisa
Scala Giovanni
Serra Angela
Publication venue: Springer, UK
Publication date: 01/01/2022

The amount of data made available by microarrays gives researchers the opportunity to delve into the complexity of biological systems. However, the noisy and extremely high-dimensional nature of this kind of data poses significant challenges. Microarrays allow for the parallel measurement of thousands of molecular objects spanning different layers of interactions. In order to be able to discover hidden patterns, the most disparate analytical techniques have been proposed. Here, we describe the basic methodologies to approach the analysis of microarray datasets that focus on the task of (sub)group discovery. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto

COMMUNITY DETECTION IN GRAPHS

Author: Gao Zheng
Publication venue: [Bloomington, Ind.] : Indiana University
Publication date: 01/06/2020

Thesis (Ph.D.) - Indiana University, Luddy School of Informatics, Computing, and Engineering/University Graduate School, 2020. Community detection has always been one of the fundamental research topics in graph mining. As an unsupervised or semi-supervised approach, community detection aims to explore high-order closeness between nodes by leveraging graph topological structure. By grouping similar nodes or edges into the same community while separating dissimilar ones into different communities, graph structure can be revealed at a coarser resolution. This can benefit numerous applications, such as shopping recommendation and advertising in e-commerce, protein-protein interaction prediction in bioinformatics, and literature recommendation or scholar collaboration in citation analysis. However, identifying communities is an ill-defined problem. Due to the No Free Lunch theorem [1], there is neither a gold standard that represents a perfect community partition nor a universal method that detects satisfactory communities for all tasks on all types of graphs. To give a global view of this research topic, I summarize state-of-the-art community detection methods by categorizing them by graph type, research task, and methodological framework. As academic work on community detection has grown rapidly in recent years, I focus particularly on state-of-the-art works published in the last decade, which may leave out some classic models published decades ago. In addition, three subtle community detection tasks are proposed and assessed in this dissertation. First, unlike general models that consider only graph structure, personalized community detection uses user needs as auxiliary information to guide community detection. The result is fine-grained communities for nodes that better match user needs and coarser-resolution communities for the remaining, less relevant nodes. Second, graphs frequently suffer from sparse connectivity. Applying conventional models directly to such graphs may severely degrade the quality of the generated communities. To tackle this problem, cross-graph techniques are used to propagate external graph information in support of community detection on the target graph. Third, graph community structure supports a natural language processing (NLP) task that depicts the intrinsic characteristics of nodes by generating node summaries with a text generation model. The contribution of this dissertation is threefold. First, a substantial body of research is reviewed and summarized under a well-defined taxonomy. Existing works on methods, evaluation, and applications are all addressed in the literature review. Second, three novel community detection tasks are demonstrated, and associated models are proposed and evaluated against state-of-the-art baselines on various datasets. Third, the limitations of current works are pointed out and promising future research directions are discussed.

IUScholarWorks (University of Indiana)

Recent advances in clustering methods for protein interaction networks

Author: Deng Youping
Li Min
Pan Yi
Wang Jianxin
Publication venue: BioMed Central
Publication date: 01/01/2010

The increasing availability of large-scale protein-protein interaction data has made it possible to understand the basic components and organization of cell machinery at the network level. The arising challenge is how to analyze such complex interaction data to reveal the principles of cellular organization, processes and functions. Many studies have shown that clustering protein interaction networks is an effective approach for identifying protein complexes or functional modules, and this has become a major research topic in systems biology. In this review, recent advances in clustering methods for protein interaction networks are presented in detail. The prediction of protein functions and interactions based on modules is covered. Finally, the performance of different clustering methods is compared and directions for future research are discussed.

Springer - Publisher Connector

Theory of preference modelling for communities in scale-free networks

Author: Dhama Sakshi
Dombi József
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2021