The role of graph entropy in fault localization and network evolution
The design of a communication network has a critical impact on its effectiveness at delivering service to the users of a large-scale compute infrastructure. In particular, the reliability of such networks is increasingly vital in the modern world, as more and more of our commercial and social activity is conducted on digital platforms. Systems to assure service availability have existed since the emergence of mainframes with the System/360 in 1964, and although commercially widespread, the scientific understanding is not as deep as the problem warrants. The basic operating principle of most service assurance systems combines the gathering of status messages, which we term events, with algorithms to deduce from the events where potential failures may be occurring. The algorithms that identify which events are causal, known as root cause analysis or fault localization, usually rely upon a detailed understanding of the network structure in order to determine those events that are most helpful in diagnosing and remediating a service-threatening problem. The complex nature of root cause algorithms introduces scalability limits in terms of the number of events that can be processed per second. Unfortunately, as networks grow, the volume of events produced continues to increase, often dramatically.
The dependence of root cause analysis algorithms on network structure presents a significant challenge as networks continue to grow in scale and complexity. As a consequence of this, and the growing reliance upon networks as part of the key fabric of the modern economy, the commercial importance and the scale of the engineering challenges are increasing significantly.
In this thesis I outline a novel approach to improving the scalability of event processing using a mathematical property of networks, graph entropy. In the first two papers described in this thesis, I apply an efficiently computable approximation of graph entropy to the problem of identifying important nodes in a network. In this context, importance is a measure of whether the failure of a node is more likely to result in a significant impact on the overall connectivity of the network, and therefore likely to lead to an interruption of service. I show that by ignoring events from unimportant network nodes it is possible to significantly reduce the event rate that a root cause algorithm needs to process. Further, I demonstrate that unimportant nodes produce very many events, but very few root causes. The consequence is that although some events relating to root causes are missed, this is compensated for by the reduction in overall event rate. This leads to a significant reduction of the event processing load on management systems, and therefore increases the effectiveness of current approaches to root cause analysis on large networks.
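The entropy-based event filtering described above can be sketched in a few lines. The thesis's efficiently computable approximation of graph entropy is not reproduced in this summary, so the sketch below uses a simple Shannon entropy over neighbour degrees as an illustrative stand-in, and the event format (`{"node": ...}`) is invented for the example:

```python
import math

def vertex_entropy(adj, v):
    """Illustrative local entropy: Shannon entropy of the degree
    distribution over v's neighbourhood. This is a stand-in for the
    thesis's approximation, which is not specified in this summary."""
    degs = [len(adj[u]) for u in adj[v]]
    total = sum(degs)
    if total == 0:
        return 0.0
    return -sum((d / total) * math.log2(d / total) for d in degs if d > 0)

def filter_events(events, adj, threshold):
    """Drop events raised by nodes whose entropy falls below the
    threshold, reducing the load on the root cause analyser."""
    return [e for e in events if vertex_entropy(adj, e["node"]) >= threshold]
```

A hub with many low-degree neighbours scores high and keeps its events; a leaf node scores zero and is filtered out, mirroring the "ignore events from unimportant nodes" strategy.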
Analysis of the topology data used in the first two papers revealed interesting anomalies in the degree distribution of the network nodes. This motivated the later focus of my research: investigating how graph entropy and network design considerations could be applied to the dynamical evolution of network structures, most commonly described using the Preferential Attachment model of Barabási and Albert. A common feature of a communication network is the presence of a constraint on the number of logical or physical connections a device can support. In the last of the three papers in the thesis I develop and present a constrained model of network evolution, which demonstrates better quantitative agreement with real-world networks than the preferential attachment model. This model, developed using the continuum approach, still does not address a fundamental question of random networks as a model of network evolution: why should a node's degree influence the likelihood of it acquiring connections? In the same paper I attempt to answer that question by outlining a model that links vertex entropy to a node's attachment probability. The model successfully reproduces some of the characteristics of preferential attachment, and illustrates the potential for entropic arguments in network science.
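A minimal sketch of constrained network growth, assuming the constraint takes the form of a hard cap on node degree (the exact form of the constrained model in the thesis is not given in this summary):

```python
import random

def constrained_ba(n, m, max_degree, seed=None):
    """Barabási-Albert-style growth in which no node may exceed
    max_degree. The hard cap is an illustrative stand-in for the
    connection constraint discussed above."""
    rng = random.Random(seed)
    # start from a small complete seed graph of m+1 nodes
    adj = {i: set() for i in range(m + 1)}
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j); adj[j].add(i)
    # a repeated-node list realises degree-proportional sampling
    targets = [v for v in adj for _ in adj[v]]
    for new in range(m + 1, n):
        adj[new] = set()
        eligible = {v for v in adj if v != new and len(adj[v]) < max_degree}
        chosen = set()
        while len(chosen) < m and len(chosen) < len(eligible):
            v = rng.choice(targets)  # preferential, but cap-respecting
            if v in eligible:
                chosen.add(v)
        for v in chosen:
            adj[new].add(v); adj[v].add(new)
            targets += [new, v]
    return adj
```

Setting `max_degree` far above `n` recovers ordinary preferential attachment; a tight cap flattens the degree-distribution tail, which is the qualitative effect the constrained model captures.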
Put together, the two main bodies of work constitute a practical advance on the state of the art of fault localization, and a theoretical insight into the inner workings of dynamic networks. They open up a number of interesting avenues for further investigation.
A Statistical Approach to Characterize and Detect Degradation Within the Barabási-Albert Network
Social Network Analysis (SNA) is widely used by the intelligence community when analyzing the relationships between individuals within groups of interest. Hence, any tools that can be quantitatively shown to improve the analyses are advantageous for the intelligence community. To date, no methods have been developed to characterize a real-world network as a Barabási-Albert network, a type of network with properties found in many real-world networks. In this research, two newly developed statistical tests using the degree distribution and the L-moments of the degree distribution are proposed, with application to classifying networks and detecting degradation within a network. The feasibility of these tests is shown by using the degree distribution for network and sub-network characterization of selected scale-free real-world networks. Further, sensitivity to the level of network degradation, via edge or node deletion, is examined, with recommendations made as to the size of degradation detectable by the statistical tests. Finally, the degree distribution of simulated Barabási-Albert networks is investigated, and the results demonstrate that the theoretical distribution derived previously in the literature is not applicable to all network sizes. These results provide a foundation on which a statistically driven approach for network characterization can be built for network classification and monitoring.
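Sample L-moments of a degree distribution can be computed from probability-weighted moments. The sketch below uses the standard unbiased estimators for the first three L-moments and the L-skewness ratio; the exact test statistics built on them in the paper are not reproduced here:

```python
def l_moments(sample):
    """First two sample L-moments (l1, l2) and the L-skewness ratio
    t3 = l3/l2, via the standard probability-weighted moments.
    Requires at least 3 observations."""
    x = sorted(sample)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum((i / (n - 1)) * x[i] for i in range(1, n)) / n
    b2 = sum((i * (i - 1) / ((n - 1) * (n - 2))) * x[i] for i in range(2, n)) / n
    l1 = b0                      # L-location (the mean)
    l2 = 2 * b1 - b0             # L-scale
    l3 = 6 * b2 - 6 * b1 + b0    # third L-moment
    return l1, l2, l3 / l2
```

Applied to the degree sequence of an observed network, these summaries can then be compared against those of simulated Barabási-Albert networks, which is the spirit of the classification tests described above.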
Integrated information increases with fitness in the evolution of animats
One of the hallmarks of biological organisms is their ability to integrate
disparate information sources to optimize their behavior in complex
environments. How this capability can be quantified and related to the
functional complexity of an organism remains a challenging problem, in
particular since organismal functional complexity is not well-defined. We
present here several candidate measures that quantify information and
integration, and study their dependence on fitness as an artificial agent
("animat") evolves over thousands of generations to solve a navigation task in
a simple, simulated environment. We compare the ability of these measures to
predict high fitness with more conventional information-theoretic processing
measures. As the animat adapts by increasing its "fit" to the world,
information integration and processing increase commensurately along the
evolutionary line of descent. We suggest that the correlation of fitness with
information integration and with processing measures implies that high fitness
requires both information processing as well as integration, but that
information integration may be a better measure when the task requires memory.
A correlation of measures of information integration (but also information
processing) and fitness strongly suggests that these measures reflect the
functional complexity of the animat, and that such measures can be used to
quantify functional complexity even in the absence of fitness data.
Comment: 27 pages, 8 figures, one supplementary figure. Three supplementary video files available on request. Version commensurate with the published text in PLoS Comput. Biol.
Computational and artificial intelligence tools for meta-analysis at the frontier between bioinformatics and the legal sciences
[Abstract] QSPR (Quantitative Structure-Property Relationships) computer models
can predict properties of complex systems, reducing experimental costs in terms of time, human resources, material resources, and/or the use of laboratory animals in the bio-molecular, technical, and/or social sciences.
Artificial Neural Networks (ANNs) are among the most powerful tools for searching for QSPR models. To this end, ANNs may use as input variables numerical parameters of the system structure called Topological Indices (TIs).
TIs are calculated in Graph Theory from a representation of any system as a network of interconnected nodes, from molecules to social and technological networks. The first aim of this thesis is to review and/or develop new TIs, TI-calculation software, and QSPR models using ANNs to predict bio-molecular, biological, commercial, social, and legal networks in which nodes represent bio-molecules, organisms, populations, products, tax laws, or causes of crimes. Moreover, the interaction of ICTs with the bio-molecular and legal sciences needs a legal security framework that allows the proper development of ICTs and their applications in the bio-molecular sciences. Therefore, the second objective of this thesis is to review the legal framework and the legal protection of QSPR techniques.
The present work aims to demonstrate the usefulness of these models in predicting the characteristics and properties of these complex systems.
Neural function approximation on graphs: shape modelling, graph discrimination & compression
Graphs serve as a versatile mathematical abstraction of real-world phenomena in numerous scientific disciplines. This thesis is part of the Geometric Deep Learning subject area, a family of learning paradigms that capitalise on the increasing volume of non-Euclidean data so as to solve real-world tasks in a data-driven manner. In particular, we focus on the topic of graph function approximation using neural networks, which lies at the heart of many relevant methods. In the first part of the thesis, we contribute to the understanding and design of Graph Neural Networks (GNNs). Initially, we investigate the problem of learning on signals supported on a fixed graph. We show that treating graph signals as general graph spaces is restrictive and that conventional GNNs have limited expressivity. Instead, we expose a more enlightening perspective by drawing parallels between graph signals and signals on Euclidean grids, such as images and audio. Accordingly, we propose a permutation-sensitive GNN based on an operator analogous to shifts in grids and instantiate it on 3D meshes for shape modelling (Spiral Convolutions). Next, we focus on learning on general graph spaces, and in particular on functions that are invariant to graph isomorphism. We identify a fundamental trade-off between invariance, expressivity and computational complexity, which we address with a symmetry-breaking mechanism based on substructure encodings (Graph Substructure Networks). Substructures are shown to be a powerful tool that provably improves expressivity while controlling computational complexity, and a useful inductive bias in network science and chemistry. In the second part of the thesis, we discuss the problem of graph compression, where we analyse the information-theoretic principles and the connections with graph generative models. We show that another inevitable trade-off surfaces, now between computational complexity and compression quality, due to graph isomorphism.
We propose a substructure-based dictionary coder, Partition and Code (PnC), with theoretical guarantees, which can be adapted to different graph distributions by estimating its parameters from observations. Additionally, contrary to the majority of neural compressors, PnC is parameter- and sample-efficient and is therefore of wide practical relevance. Finally, within this framework, substructures are further illustrated as a decisive archetype for learning problems on graph spaces.
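As a minimal illustration of substructure encodings, the sketch below counts, for each node, the triangles it participates in. GSN-style models use richer families of substructure counts as symmetry-breaking node features; triangles are only the simplest instance:

```python
def triangle_counts(adj):
    """Per-node triangle participation counts: for each node, the
    number of pairs of its neighbours that are themselves adjacent.
    The simplest example of a substructure encoding."""
    counts = {v: 0 for v in adj}
    for v in adj:
        nbrs = list(adj[v])
        for i in range(len(nbrs)):
            for j in range(i + 1, len(nbrs)):
                if nbrs[j] in adj[nbrs[i]]:
                    counts[v] += 1
    return counts
```

Appending such counts to node features lets a message-passing network distinguish graphs that plain neighbourhood aggregation cannot, which is the expressivity gain the abstract refers to.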
Neural correlates of consciousness in the complexity of brain networks
How do we define consciousness? Beyond philosophical endeavours, the development of modern neuroimaging techniques has fostered a principled way of quantifying the neural correlates of consciousness. Acquiring and analysing resting-state functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) data has allowed neuroscientists to noninvasively map the brain's functional interactions (or functional connectivity). Based on data obtained during controlled loss of consciousness and in patients with disorders of consciousness, it has been suggested that multiple, functionally specialized/segregated areas need to interact and integrate information in order to support consciousness. Thus an emerging idea in neuroscience is that the brain needs to balance the coexistence of functional segregation and integration, a property often termed brain complexity, in order to produce consciousness. A resulting hypothesis is that consciousness is abolished when the balance between segregation and integration is lost and brain complexity is attenuated.
In that regard, I use the complexity of functional connectivity, an aggregate measure of segregation and integration, as a marker of consciousness. This effort consists of two parts. First, I provide evidence that complexity in the healthy, awake brain is critical in the sense that it reflects a critical balance of segregation and integration designed to support efficient information communication. In turn, I provide evidence that loss of consciousness is associated with decreased complexity, i.e. that functional connectivity departs from the critical complexity of the healthy, awake brain towards a more segregated configuration.
The structure of this thesis follows accordingly. In the first experimental chapter (Chapter 3), I show the importance of the critical balance of complexity in the healthy, awake brain using a structure-to-function association model. Specifically, I show that complexity can be derived from certain optimal structural connections (computed as the Nash equilibrium between regions), which promote efficient communication in the brain from the regional to the whole-brain level.
Chapter 4 focuses on capturing alterations of complexity in cases of sedation, anaesthesia and disorders of consciousness. Specifically, I show that as one goes from the awake state to anaesthetic-induced unconsciousness and disorders of consciousness, functional connectivity becomes less complex and more segregated. A refined approach that quantifies complexity in different parts of the brain allowed me to see whether this reduction in complexity is more evident in specific regions and networks. Under this framework, at the regional level I provide evidence that sparsely connected regions linking different parts of the brain play a critical role in whole-brain complexity. At the network level I show the importance of the default mode network in whole-brain complexity.
Even during rest, the brain is not static and displays rich temporal dynamics. Thus it is not only the complexity at each snapshot of time but also how complexity changes across time that can help us understand loss of consciousness. In chapter 5 I use a dynamic framework to derive and characterize the dynamics of functional connectivity during loss of consciousness. In turn, I provide evidence that brains become less temporally complex as one goes from the awake state to anaesthetic-induced unconsciousness and disorders of consciousness.
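A generic version of such a dynamic framework is sliding-window functional connectivity. The sketch below computes a windowed Pearson correlation matrix for each time window; the window and step sizes are purely illustrative, and the thesis's own pipeline may differ:

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def sliding_connectivity(signals, window, step):
    """Time-resolved functional connectivity: one correlation matrix
    per sliding window over the regional time series."""
    T = len(signals[0])
    matrices = []
    for start in range(0, T - window + 1, step):
        sl = [s[start:start + window] for s in signals]
        matrices.append([[pearson(a, b) for b in sl] for a in sl])
    return matrices
```

The temporal variability of these matrices is the raw material for measures of how complexity changes across time, of the kind examined in chapter 5.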
Moreover, my goal is to see whether the principle of complexity reduction can be also applied to the developing brain. Towards this direction, in chapter 6 I use complexity on EEG connectivity data to examine anaesthetic-induced loss of consciousness in infants. Specifically, I show that complexity in anaesthetised infants aged 0-3 years is reduced compared to a state of emergence from anaesthesia, indicating its importance in supporting consciousness and brain function since infancy.
Taken together, these findings show that while the complexity of the healthy, awake brain during rest is critically configured, the unconscious brain is characterized by reduced complexity. Based on the results presented in this work, I propose that consciousness can be assessed on the basis of the complexity of resting-state functional connectivity data.
Optimisation and information-theoretic principles in multiplex networks.
The multiplex network paradigm has proven very helpful in the study of many real-world
complex systems, by allowing one to retain full information about all the different possible kinds of
relationships among the elements of a system. As a result, new non-trivial structural patterns
have been found in diverse multi-dimensional networked systems, from transportation networks to
the human brain. However, the analysis of multiplex structural and dynamical properties often
requires more sophisticated algorithms and takes longer time to run compared to traditional
single network methods. As a consequence, relying on a multiplex formulation should be the
outcome of a trade-off between the level of information and the resources required to store it.
In the first part of the thesis, we address the problem of quantifying and comparing the
amount of information contained in multiplex networks. We propose an algorithmic information-theoretic
approach to evaluate the complexity of multiplex networks, by assessing to what extent
a given multiplex representation of a system is more informative than a single-layer graph. Then,
we demonstrate that the same measure is able to detect redundancy in a multiplex network and
to obtain meaningful lower-dimensional representations of a system. Finally, we show that this
method allows us to retain most of the structural complexity of the original system as well as
the salient characteristics determining the behaviour of dynamical processes happening on it.
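One crude way to operationalise this idea, assuming compressed size as a stand-in for the algorithmic-complexity estimator actually used in the thesis, is to compare the zlib-compressed edge lists of the full multiplex against those of the aggregated single-layer graph:

```python
import zlib

def encode_edges(edges):
    """Canonical byte encoding of an edge set."""
    return "\n".join(f"{u},{v}" for u, v in sorted(edges)).encode()

def compressed_size(layers):
    """Crude Kolmogorov-complexity proxy: the zlib-compressed size of
    the layer-by-layer edge lists."""
    return len(zlib.compress(b"|".join(encode_edges(l) for l in layers)))

def multiplex_gain(layers):
    """Extra description length of the multiplex over the aggregated
    single-layer graph. Large values suggest redundant layers; small
    values suggest the multiplex carries genuinely extra information."""
    aggregated = set().union(*layers)
    return compressed_size(layers) - compressed_size([aggregated])
```

Under this proxy, duplicated layers compress away cheaply while structurally distinct layers do not, which is the intuition behind using such a measure to detect redundancy and choose lower-dimensional representations.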
In the second part of the thesis, we shift the focus to the modelling and analysis of some structural
features of real-world multiplex systems through optimisation principles. We demonstrate
that Pareto optimal principles provide remarkable tools not only to model real-world
multiplex transportation systems but also to characterise the robustness of multiplex systems
against targeted attacks in the context of optimal percolation.
Modelling and measurement in synthetic biology
Synthetic biology applies engineering principles to make progress in the study of complex
biological phenomena. The aim is to develop understanding through the praxis of
construction and design. The computational branch of this endeavour explicitly brings
the tools of abstraction and modularity to bear. This thesis pursues two distinct lines of
inquiry concerning the application of computational tools in the setting of synthetic biology.
One thread traces a narrative through multi-paradigm computational simulations,
interpretation of results, and quantification of biological order. The other develops computational
infrastructure for describing, simulating, and discovering synthetic genetic
circuits.
The emergence of structure in biological organisms, morphogenesis, is critically
important for understanding both normal and pathological development of tissues. Here,
we focus on epithelial tissues because models of two dimensional cellular monolayers
are computationally tractable. We use a vertex model that consists of a potential energy
minimisation process interwoven with topological changes in the graph structure of the
tissue. To make this interweaving precise, we define a language for propagators from
which an unambiguous description of the simulation methodology can be constructed.
The vertex model is then used to reproduce laboratory results of patterning in engineered
mammalian cells. The claim of reproduction is justified by
a novel measure of structure on coloured graphs, which we call path entropy. This
measure is then extended to the setting of continuous regions and used to quantify
the development of structure in house mouse (Mus musculus) embryos using three
dimensional segmented anatomical models.
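The exact definition of path entropy is not given in this summary; an illustrative entropy over colour sequences along fixed-length walks conveys the flavour, with the caveat that the measure defined in the thesis may differ in detail:

```python
import math

def path_entropy(adj, colour, length):
    """Shannon entropy of the distribution of colour sequences seen
    along all walks of a given length in a node-coloured graph. An
    illustrative stand-in for the thesis's path entropy measure."""
    walks = [[v] for v in adj]
    for _ in range(length):
        walks = [w + [u] for w in walks for u in adj[w[-1]]]
    counts = {}
    for w in walks:
        key = tuple(colour[v] for v in w)
        counts[key] = counts.get(key, 0) + 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A uniformly coloured tissue graph scores zero, while a patterned colouring produces a spread of colour sequences and hence positive entropy, which is the sense in which such a measure quantifies the emergence of structure.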
While it is recognised that DNA can be considered a powerful computational
environment, it is far from obvious how to program with nucleic acids. Using rule-based
modelling of modular biological parts, we develop a method for discovering synthetic
genetic programs that meet a specification provided by the user. This method rests on
the concept of annotation as applied to rule-based programs. We begin with annotating
rules and proceed to generating entire rule-based programs from annotations themselves.
Building on those tools we describe an evolutionary algorithm for discovering genetic
circuits from specifications provided in terms of probability distributions. This strategy
provides a dual benefit: using stochastic simulation captures circuit behaviour at low
copy numbers as well as complex properties such as oscillations, and using standard
biological parts produces results that are implementable in the laboratory.
Connectomics of extrasynaptic signalling: applications to the nervous system of Caenorhabditis elegans
Connectomics – the study of neural connectivity – is primarily concerned with the mapping and characterisation of wired synaptic links; however, it is well established that long-distance chemical signalling via extrasynaptic volume transmission is also critical to brain function. As these interactions are not visible in the physical structure of the nervous system, current approaches to connectomics are unable to capture them.
This work addresses the problem of missing extrasynaptic interactions by demonstrating for the first time that whole-animal volume transmission networks can be mapped from gene expression and ligand-receptor interaction data, and analysed as part of the connectome. Complete networks are presented for the monoamine systems of Caenorhabditis elegans, along with a representative sample of selected neuropeptide systems.
A network analysis of the synaptic (wired) and extrasynaptic (wireless) connectomes is presented which reveals complex topological properties, including extrasynaptic rich-club organisation with interconnected hubs distinct from those in the synaptic and gap junction networks, and highly significant multilink motifs pinpointing locations in the network where aminergic and neuropeptide signalling is likely to modulate synaptic activity. Thus, the neuronal connectome can be modelled as a multiplex network with synaptic, gap junction, and neuromodulatory layers representing inter-neuronal interactions with different dynamics and polarity. This represents a prototype for understanding how extrasynaptic signalling can be integrated into connectomics research, and provides a novel dataset for the development of multilayer network algorithms.
This work was supported by the Medical Research Council (MRC).
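Rich-club organisation of the kind reported here is commonly quantified with the rich-club coefficient. The sketch below implements the standard unnormalised coefficient from network science, not necessarily the exact pipeline used in this work:

```python
def rich_club(adj, k):
    """Unnormalised rich-club coefficient phi(k): the edge density of
    the subgraph induced by nodes of degree greater than k."""
    rich = [v for v in adj if len(adj[v]) > k]
    n = len(rich)
    if n < 2:
        return 0.0
    edges = sum(1 for i, v in enumerate(rich)
                for u in rich[i + 1:] if u in adj[v])
    return 2 * edges / (n * (n - 1))
```

In practice phi(k) is compared against the same quantity on degree-preserving randomised networks; a ratio above one indicates that hubs are more densely interconnected than chance, the signature of a rich club.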