How threshold behaviour affects the use of subgraphs for network comparison
Motivation: A wealth of protein–protein interaction (PPI) data has recently become available. These data are organized as PPI networks and an efficient and biologically meaningful method to compare such PPI networks is needed. As a first step, we would like to compare observed networks to established network models, under the aspect of small subgraph counts, as these are conjectured to relate to functional modules in the PPI network. We employ the software tool GraphCrunch with the Graphlet Degree Distribution Agreement (GDDA) score to examine the use of such counts for network comparison.
Beyond clustering: mean-field dynamics on networks with arbitrary subgraph composition
Clustering is the propensity of nodes that share a common neighbour to be connected. It is ubiquitous in many networks but poses many modelling challenges. Clustering typically manifests itself by a higher than expected frequency of triangles, and this has led to the principle of constructing networks from such building blocks. This approach has been generalised to networks being constructed from a set of more exotic subgraphs. As long as these are fully connected, it is then possible to derive mean-field models that approximate epidemic dynamics well. However, there are virtually no results for non-fully connected subgraphs. In this paper, we provide a general and automated approach to deriving a set of ordinary differential equations, or mean-field model, that describes, to a high degree of accuracy, the expected values of system-level quantities, such as the prevalence of infection. Our approach offers a previously unattainable degree of control over the arrangement of subgraphs and network characteristics such as classical node degree, variance and clustering. The combination of these features makes it possible to generate families of networks with different subgraph compositions while keeping classical network metrics constant. Using our approach, we show that higher-order structure realised either through the introduction of loops of different sizes or by generating networks based on different subgraphs but with identical degree distribution and clustering, leads to non-negligible differences in epidemic dynamics
How are topics born? Understanding the research dynamics preceding the emergence of new areas
The ability to promptly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. While the literature describes several approaches which aim to identify the emergence of new research topics early in their lifecycle, these rely on the assumption that the topic in question is already associated with a number of publications and consistently referred to by a community of researchers. Hence, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. In this paper, we begin to address this challenge by performing a study of the dynamics preceding the creation of new topics. This study indicates that the emergence of a new topic is anticipated by a significant increase in the pace of collaboration between relevant research areas, which can be seen as the “parents” of the new topic. These initial findings (i) confirm our hypothesis that it is possible in principle to detect the emergence of a new topic at the embryonic stage, (ii) provide new empirical evidence supporting relevant theories in Philosophy of Science, and also (iii) suggest that new topics tend to emerge in an environment in which weakly interconnected research areas begin to cross-fertilise
On the basic computational structure of gene regulatory networks
Gene regulatory networks constitute the first layer of the cellular
computation for cell adaptation and surveillance. In these webs, a set of
causal relations is built up from thousands of interactions between
transcription factors and their target genes. The large size of these webs and
their entangled nature make it difficult to achieve a global view of their
internal organisation. Here, this problem has been addressed through a
comparative study for Escherichia coli, Bacillus subtilis and
Saccharomyces cerevisiae gene regulatory networks. We extract the minimal core
of causal relations, uncovering the hierarchical and modular organisation from
a novel dynamical/causal perspective. Our results reveal a marked top-down
hierarchy containing several small dynamical modules for E. coli and
B. subtilis. Conversely, the yeast network displays a single but large
dynamical module in the middle of a bow-tie structure. We found that these
dynamical modules capture the relevant wiring among both common and
organism-specific biological functions such as transcription initiation,
metabolic control, signal transduction, response to stress, sporulation and
cell cycle. Functional and topological results suggest that two fundamentally
different forms of logic organisation may have evolved in bacteria and yeast.
Comment: This article is published in Molecular Biosystems. Please cite as:
Carlos Rodriguez-Caso, Bernat Corominas-Murtra and Ricard V. Sole. Mol.
BioSyst., 2009, 5 pp 1617--171
Evaluation Measures for Hierarchical Classification: a unified view and novel approaches
Hierarchical classification addresses the problem of classifying items into a
hierarchy of classes. An important issue in hierarchical classification is the
evaluation of different classification algorithms, which is complicated by the
hierarchical relations among the classes. Several evaluation measures have been
proposed for hierarchical classification using the hierarchy in different ways.
This paper studies the problem of evaluation in hierarchical classification by
analyzing and abstracting the key components of the existing performance
measures. It also proposes two alternative generic views of hierarchical
evaluation and introduces two corresponding novel measures. The proposed
measures, along with the state-of-the-art ones, are empirically tested on three
large datasets from the domain of text classification. The empirical results
illustrate the undesirable behavior of existing approaches and how the proposed
methods overcome most of these problems across a range of cases.
Comment: Submitted to journa
SIS epidemic propagation on hypergraphs
Mathematical modeling of epidemic propagation on networks is extended to
hypergraphs in order to account for both the community structure and the
nonlinear dependence of the infection pressure on the number of infected
neighbours. The exact master equations of the propagation process are derived
for an arbitrary hypergraph given by its incidence matrix. Based on these,
moment closure approximation and mean-field models are introduced and compared
to individual-based stochastic simulations. The simulation algorithm, developed
for networks, is extended to hypergraphs. The effects of hypergraph structure
and the model parameters are investigated via individual-based simulation
results
apk2vec: Semi-supervised multi-view representation learning for profiling Android applications
Building behavior profiles of Android applications (apps) with holistic, rich
and multi-view information (e.g., incorporating several semantic views of an
app such as API sequences, system calls, etc.) would help cater to downstream
analytics tasks such as app categorization, recommendation and malware analysis
significantly better. Towards this goal, we design a semi-supervised
Representation Learning (RL) framework named apk2vec to automatically generate
a compact representation (aka profile/embedding) for a given app. More
specifically, apk2vec has the following three unique characteristics which make
it an excellent choice for large-scale app profiling: (1) it encompasses
information from multiple semantic views such as API sequences, permissions,
etc., (2) being a semi-supervised embedding technique, it can make use of
labels associated with apps (e.g., malware family or app category labels) to
build high quality app profiles, and (3) it combines RL and feature hashing
which allows it to efficiently build profiles of apps that stream over time
(i.e., online learning). The resulting semi-supervised multi-view hash
embeddings of apps could then be used for a wide variety of downstream tasks
such as the ones mentioned above. Our extensive evaluations with more than
42,000 apps demonstrate that apk2vec's app profiles could significantly
outperform state-of-the-art techniques in four app analytics tasks, namely
malware detection, familial clustering, app clone detection and app
recommendation.
Comment: International Conference on Data Mining, 201