167 research outputs found

    Computation in Complex Networks

    Complex networks are among the most challenging research topics across disciplines including physics, mathematics, biology, medicine, engineering, and computer science. Interest in them keeps growing because they can model many everyday systems, such as technology networks, the Internet, and communication, chemical, neural, social, political, and financial networks. The Special Issue "Computation in Complex Networks" of Entropy offers a multidisciplinary view of how some complex systems behave, collecting original, high-quality papers in the following research fields: community detection; complex network modelling; complex network analysis; node classification; information spreading and control; network robustness; social networks; network medicine

    Algorithmique des réseaux socio-sémantiques pour la visualisation par points de vue des communautés en ligne

    Community detection can rely on either the structural dimension or the composition dimension of a social network. In the first case, communities are groups of well-connected but dissimilar nodes; in the second, they are groups of similar but loosely connected nodes. Either way, less information is extracted because one dimension is discarded. The objective of this thesis is to propose a novel approach to community detection that integrates the structural and composition dimensions so that communities contain groups of well-connected and similar nodes. The approach first requires a new definition of community that covers both dimensions of the network, and then a new community detection model suited to this definition. The model begins by introducing the notion of point of view, which divides the composition dimension so that the network can be analyzed from different perspectives. It then influences the community detection process by integrating the composition information into the graph structure. The last step is a visualization of the social network that places nodes according to their structural and compositional similarities and helps identify nodes that are important for the interaction between communities
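One way to picture "integrating the composition information into the graph structure" is to turn attribute similarity into edge weights, which a weighted community detection method could then use. The sketch below is only an illustration under stated assumptions: the abstract does not specify the thesis's actual model, and the Jaccard similarity, the `alpha` mixing parameter, and the function names are all hypothetical choices.

```python
def jaccard(a, b):
    """Jaccard similarity of two attribute collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reweight_edges(edges, attributes, alpha=0.5):
    """Blend structure and composition into one weight per edge.

    edges      -- iterable of (u, v) pairs (hypothetical input format)
    attributes -- dict mapping each node to its attributes for one point of view
    alpha      -- assumed mixing parameter: 0 keeps pure structure, 1 pure similarity
    """
    return {
        (u, v): (1.0 - alpha) + alpha * jaccard(attributes[u], attributes[v])
        for u, v in edges
    }


# Toy network: the "interests" attribute plays the role of one point of view.
edges = [("ana", "ben"), ("ben", "cy"), ("ana", "cy")]
interests = {"ana": {"jazz", "chess"}, "ben": {"jazz", "chess"}, "cy": {"golf"}}
weights = reweight_edges(edges, interests, alpha=0.5)
```

In this toy example the edge between the two nodes with identical interests ends up heavier than the edges to the dissimilar node, so a weighted detection method would tend to group the similar, connected pair.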

    Détection de communautés dans les réseaux d'information utilisant liens et attributs

    Whereas social networks represent entities and the relationships between them, information networks also include attributes describing these entities, which calls for revisiting the methods used to analyze and mine these networks. In this work, we propose methods for classifying the entities of an information network that exploit both the relationships between entities and the attributes characterizing them. We focus on attributed graphs in which entities are described by numerical feature vectors. We propose approaches based on established techniques for each type of information, notably inertia for clustering and Newman and Girvan's modularity for community detection. We evaluate our proposals on networks built from bibliographic data, using textual information in particular. We also evaluate our methods under various changes to the network, such as a deterioration of the link or attribute information, and characterize their robustness to such changes
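For readers unfamiliar with the modularity criterion named in this abstract, the following minimal sketch computes Newman and Girvan's modularity for an undirected, unweighted graph, using the per-community form Q = Σ_c [ L_c/m − (d_c/2m)² ], where m is the number of edges, L_c the number of edges inside community c, and d_c the sum of degrees in c. The function name, the input format, and the toy graph are assumptions made for illustration; this is not the authors' code and it does not cover the attribute (inertia) side of their approach.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of an undirected, unweighted graph.

    edges     -- list of (u, v) pairs, each edge listed once, no self-loops
    community -- dict mapping each node to its community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")
    intra = defaultdict(int)       # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: sum of node degrees in community c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(toy_edges, two_groups)   # 5/14, about 0.357
q_single = modularity(toy_edges, one_group)   # 0.0
```

Splitting the toy graph into its two triangles scores higher than putting every node in one community, which is the behaviour a modularity-based method exploits.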

    Computational Labeling, Partitioning, and Balancing of Molecular Networks

    Recent advances in high throughput techniques enable large-scale molecular quantification with high accuracy, including mRNAs, proteins and metabolites. Differential expression of these molecules in case and control samples provides a way to select phenotype-associated molecules with statistically significant changes. However, given the significance ranking list of molecular changes, how those molecules work together to drive phenotype formation is still unclear. In particular, the changes in molecular quantities are insufficient to interpret the changes in their functional behavior. My study is aimed at answering this question by integrating molecular network data to systematically model and estimate the changes of molecular functional behaviors. We build three computational models to label, partition, and balance molecular networks using modern machine learning techniques. (1) Due to the incompleteness of protein functional annotation, we develop AptRank, an adaptive PageRank model for protein function prediction on bilayer networks. By integrating Gene Ontology (GO) hierarchy with protein-protein interaction network, our AptRank outperforms four state-of-the-art methods in a comprehensive evaluation using benchmark datasets. (2) We next extend our AptRank into a network partitioning method, BioSweeper, to identify functional network modules in which molecules share similar functions and also densely connect to each other. Compared to traditional network partitioning methods using only network connections, BioSweeper, which integrates the GO hierarchy, can automatically identify functionally enriched network modules. (3) Finally, we conduct a differential interaction analysis, namely difFBA, on protein-protein interaction networks by simulating protein fluxes using flux balance analysis (FBA). 
We test difFBA on quantitative proteomic data from colon cancer and demonstrate that it offers more insight into functional changes in molecular behavior than protein quantity changes alone. We conclude that our integrative network model increases the observational dimensions of complex biological systems and enables a deeper understanding of the causal relationships between genotypes and phenotypes
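AptRank is described above as an adaptive PageRank model. As background only, here is a minimal sketch of the standard PageRank power iteration on a small directed graph. It is plain PageRank, not AptRank: the adaptive scheme and the bilayer GO/protein network are not specified in the abstract and are not reproduced here. The function name, parameters, and toy graph are assumptions.

```python
def pagerank(adj, damping=0.85, tol=1e-12, max_iter=500):
    """Standard PageRank by power iteration.

    adj -- dict mapping every node to the list of nodes it links to;
           every link target must also appear as a key.
    """
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Teleportation term shared equally by all nodes.
        new = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            targets = adj[u]
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new[v] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[u] / n
                for v in nodes:
                    new[v] += share
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy directed graph with hypothetical node names.
toy = {"p1": ["p2", "p3"], "p2": ["p3"], "p3": ["p1"], "p4": ["p3"]}
ranks = pagerank(toy)
top_node = max(ranks, key=ranks.get)
```

The ranks form a probability distribution over nodes. In the toy graph the node with the most incoming links ranks highest, and the node with no incoming links keeps only the teleportation share.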

    Hierarchical graphs and oscillator dynamics.

    In many types of network, the relationship between structure and function is of great significance. This work is particularly concerned with community structures, which arise in a wide variety of domains. A simple oscillator model is applied to networks with community structures and shows that waves of regular oscillation are caused by synchronised clusters of nodes. Moreover, we demonstrate that such global oscillations may arise as a direct result of network topology. We also observe that additional modes of oscillation (as detected through frequency analysis) occur in networks with additional levels of hierarchy and that such modes may be directly related to network structure. This method is applied in two specific domains (metabolic networks and metropolitan transport), demonstrating the robustness of the results when applied to real world systems. A topological analysis is also applied to the real world networks of metabolism and metropolitan transport using standard graphical measures. This yields a new artificial network growth model, which agrees closely with the graphical measures taken on metabolic pathway networks. This new model demonstrates a simple mechanism to produce the particular features found in these networks. We conclude that (where the distribution of oscillator frequencies and the interactions between them are known to be unimodal) the observations may be applicable to the detection of underlying community structure in networks, shedding further light on the general relationship between structure and function in complex systems

    Aspects of Spatial Trajectory Data Management–Compression and Clustering

    11th SC@RUG 2014 proceedings: Student Colloquium 2013-2014

    Evaluation of clustering results and novel cluster algorithms

    Cluster analysis is frequently performed in many application fields to find groups in data. For example, in medicine, researchers have used gene expression data to cluster patients suffering from a particular disease (e.g., breast cancer), in order to detect new disease subtypes. Many cluster algorithms and methods for cluster validation, i.e., methods for evaluating the quality of cluster analysis results, have been proposed in the literature. However, open questions about the evaluation of both clustering results and novel cluster algorithms remain. It has rarely been discussed whether a) interesting clustering results or b) promising performance evaluations of newly presented cluster algorithms might be over-optimistic, in the sense that these good results cannot be replicated on new data or in other settings. Such questions are relevant in light of the so-called "replication crisis"; in various research disciplines such as medicine, biology, psychology, and economics, many results have turned out to be non-replicable, casting doubt on the trustworthiness and reliability of scientific findings. This crisis has led to increasing popularity of "metascience". Metascientific studies analyze problems that have contributed to the replication crisis (e.g., questionable research practices), and propose and evaluate possible solutions. So far, metascientific studies have mainly focused on issues related to significance testing. In contrast, this dissertation addresses the reliability of a) clustering results in applied research and b) results concerning newly presented cluster algorithms in the methodological literature. Different aspects of this topic are discussed in three Contributions. The first Contribution presents a framework for validating clustering results on validation data. Using validation data is vital to examine the replicability and generalizability of results. 
While applied researchers sometimes use validation data to check their clustering results, our article is the first to review the different approaches in the literature and to structure them in a systematic manner. We demonstrate that many classical cluster validation techniques, such as internal and external validation, can be combined with validation data. Our framework provides guidance to applied researchers who wish to evaluate their own clustering results or the results of other teams on new data. The second Contribution applies the framework from Contribution 1 to quantify over-optimistic bias in the context of a specific application field, namely unsupervised microbiome research. We analyze over-optimism effects which result from the multiplicity of analysis strategies for cluster analysis and network learning. The plethora of possible analysis strategies poses a challenge for researchers who are often uncertain about which method to use. Researchers might be tempted to try different methods on their dataset and look for the method yielding the "best" result. If only the "best" result is selectively reported, this may cause "overfitting" of the method to the dataset and the result might not be replicable on validation data. We quantify such over-optimism effects for four illustrative types of unsupervised research tasks (clustering of bacterial genera, hub detection in microbial association networks, differential network analysis, and clustering of samples). Contributions 1 and 2 consider the evaluation of clustering results and thus adopt a metascientific perspective on applied research. In contrast, the third Contribution is a metascientific study about methodological research on the development of new cluster algorithms. This Contribution analyzes the over-optimistic evaluation and reporting of novel cluster algorithms. 
As an illustrative example, we consider the recently proposed cluster algorithm "Rock"; initially deemed promising, it later turned out to be not generally better than its competitors. We demonstrate how Rock can nevertheless appear to outperform competitors via optimization of the evaluation design, namely the data types used, the data characteristics, the algorithm's parameters, and the choice of competing algorithms. The study is a cautionary tale that illustrates how easy it can be for researchers to claim apparent "superiority" of a new cluster algorithm. This, in turn, stresses the importance of strategies for avoiding the problems of over-optimism, such as neutral benchmark studies
