1,638 research outputs found

    How threshold behaviour affects the use of subgraphs for network comparison

    Motivation: A wealth of protein–protein interaction (PPI) data has recently become available. These data are organized as PPI networks and an efficient and biologically meaningful method to compare such PPI networks is needed. As a first step, we would like to compare observed networks to established network models, under the aspect of small subgraph counts, as these are conjectured to relate to functional modules in the PPI network. We employ the software tool GraphCrunch with the Graphlet Degree Distribution Agreement (GDDA) score to examine the use of such counts for network comparison

    Beyond clustering: mean-field dynamics on networks with arbitrary subgraph composition

    Clustering is the propensity of nodes that share a common neighbour to be connected. It is ubiquitous in many networks but poses many modelling challenges. Clustering typically manifests itself by a higher than expected frequency of triangles, and this has led to the principle of constructing networks from such building blocks. This approach has been generalised to networks being constructed from a set of more exotic subgraphs. As long as these are fully connected, it is then possible to derive mean-field models that approximate epidemic dynamics well. However, there are virtually no results for non-fully connected subgraphs. In this paper, we provide a general and automated approach to deriving a set of ordinary differential equations, or mean-field model, that describes, to a high degree of accuracy, the expected values of system-level quantities, such as the prevalence of infection. Our approach offers a previously unattainable degree of control over the arrangement of subgraphs and network characteristics such as classical node degree, variance and clustering. The combination of these features makes it possible to generate families of networks with different subgraph compositions while keeping classical network metrics constant. Using our approach, we show that higher-order structure realised either through the introduction of loops of different sizes or by generating networks based on different subgraphs but with identical degree distribution and clustering, leads to non-negligible differences in epidemic dynamics

    How are topics born? Understanding the research dynamics preceding the emergence of new areas

    The ability to promptly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. While the literature describes several approaches which aim to identify the emergence of new research topics early in their lifecycle, these rely on the assumption that the topic in question is already associated with a number of publications and consistently referred to by a community of researchers. Hence, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. In this paper, we begin to address this challenge by performing a study of the dynamics preceding the creation of new topics. This study indicates that the emergence of a new topic is anticipated by a significant increase in the pace of collaboration between relevant research areas, which can be seen as the ‘parents’ of the new topic. These initial findings (i) confirm our hypothesis that it is possible in principle to detect the emergence of a new topic at the embryonic stage, (ii) provide new empirical evidence supporting relevant theories in Philosophy of Science, and also (iii) suggest that new topics tend to emerge in an environment in which weakly interconnected research areas begin to cross-fertilise

    On the basic computational structure of gene regulatory networks

    Gene regulatory networks constitute the first layer of the cellular computation for cell adaptation and surveillance. In these webs, a set of causal relations is built up from thousands of interactions between transcription factors and their target genes. The large size of these webs and their entangled nature make it difficult to achieve a global view of their internal organisation. Here, this problem has been addressed through a comparative study of the Escherichia coli, Bacillus subtilis and Saccharomyces cerevisiae gene regulatory networks. We extract the minimal core of causal relations, uncovering the hierarchical and modular organisation from a novel dynamical/causal perspective. Our results reveal a marked top-down hierarchy containing several small dynamical modules for E. coli and B. subtilis. Conversely, the yeast network displays a single but large dynamical module in the middle of a bow-tie structure. We found that these dynamical modules capture the relevant wiring among both common and organism-specific biological functions such as transcription initiation, metabolic control, signal transduction, response to stress, sporulation and cell cycle. Functional and topological results suggest that two fundamentally different forms of logic organisation may have evolved in bacteria and yeast. Comment: This article is published at Molecular Biosystems, Please cite as: Carlos Rodriguez-Caso, Bernat Corominas-Murtra and Ricard V. Sole. Mol. BioSyst., 2009, 5 pp 1617--171

    Evaluation Measures for Hierarchical Classification: a unified view and novel approaches

    Hierarchical classification addresses the problem of classifying items into a hierarchy of classes. An important issue in hierarchical classification is the evaluation of different classification algorithms, which is complicated by the hierarchical relations among the classes. Several evaluation measures have been proposed for hierarchical classification using the hierarchy in different ways. This paper studies the problem of evaluation in hierarchical classification by analyzing and abstracting the key components of the existing performance measures. It also proposes two alternative generic views of hierarchical evaluation and introduces two corresponding novel measures. The proposed measures, along with the state-of-the-art ones, are empirically tested on three large datasets from the domain of text classification. The empirical results illustrate the undesirable behavior of existing approaches and how the proposed methods overcome most of these problems across a range of cases. Comment: Submitted to journa

    SIS epidemic propagation on hypergraphs

    Mathematical modeling of epidemic propagation on networks is extended to hypergraphs in order to account for both the community structure and the nonlinear dependence of the infection pressure on the number of infected neighbours. The exact master equations of the propagation process are derived for an arbitrary hypergraph given by its incidence matrix. Based on these, moment closure approximation and mean-field models are introduced and compared to individual-based stochastic simulations. The simulation algorithm, developed for networks, is extended to hypergraphs. The effects of hypergraph structure and the model parameters are investigated via individual-based simulation results

    apk2vec: Semi-supervised multi-view representation learning for profiling Android applications

    Building behavior profiles of Android applications (apps) with holistic, rich and multi-view information (e.g., incorporating several semantic views of an app such as API sequences, system calls, etc.) would help cater to downstream analytics tasks such as app categorization, recommendation and malware analysis significantly better. Towards this goal, we design a semi-supervised Representation Learning (RL) framework named apk2vec to automatically generate a compact representation (aka profile/embedding) for a given app. More specifically, apk2vec has the following three unique characteristics which make it an excellent choice for large-scale app profiling: (1) it encompasses information from multiple semantic views such as API sequences, permissions, etc., (2) being a semi-supervised embedding technique, it can make use of labels associated with apps (e.g., malware family or app category labels) to build high quality app profiles, and (3) it combines RL and feature hashing which allows it to efficiently build profiles of apps that stream over time (i.e., online learning). The resulting semi-supervised multi-view hash embeddings of apps could then be used for a wide variety of downstream tasks such as the ones mentioned above. Our extensive evaluations with more than 42,000 apps demonstrate that apk2vec's app profiles could significantly outperform state-of-the-art techniques in four app analytics tasks, namely malware detection, familial clustering, app clone detection and app recommendation. Comment: International Conference on Data Mining, 201