Application of bio-model engineering to model abstract biological behaviours
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London.
Life in nature is defined by many characteristics. Whether something can move, communicate,
respond to others, reproduce or die indicates whether it is alive. Among these features,
communication can be considered the most basic and yet the most important as it happens both
inside and outside an organism; between every molecule and every cell there are signals to be
passed and to be responded to. Communication defines biology.
A network of molecules and a society of organisms are both complex systems. The smallest
change in such a snarled network affects the whole system and changes the output significantly.
Comprehending and manipulating them in detail is time- and resource-consuming and prone to
human error. But there is a way to simplify the process of inspecting living creatures.
Bio-model engineering lies at the crossroads of biology, mathematics, computer science,
engineering and is a branch of systems biology. In this field of science, biological models are created
and/or re-designed for simplification, abstraction and description of biological networks. Modelling
these networks based on past experimental observations in silico with a set of pre-designed models
and a collection of components would make this process faster and simpler.
This thesis contributes to science by providing a collection of model components built in
Petri nets with Snoopy. These components each describe a specific behaviour and they can be
used individually or in combination. The set of behaviours in this collection includes chemotaxis,
reproduction, death, communication and response. These are a few of the most basic behaviours
in nature that mark something as alive: they distinguish a piece of stone, which is not alive,
from the microscopic bacteria on it, which are.
Starting with small, achievable steps, these components are modelled abstractly, meaning
they demonstrate only the critical parts of the behaviours. Not only the models but also
the process of modelling and combining the components is provided, derived from the adaptation
and manipulation of a general protocol.
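The reproduction and death components can be pictured as a tiny stochastic Petri net with one place (Cell) and two transitions. The sketch below is a Gillespie-style simulation with made-up rate constants, purely illustrative; the thesis itself builds these components in Snoopy and simulates them with Spike.

```python
import random

def simulate(cells=10, k_rep=1.0, k_die=0.5, t_end=5.0, seed=42):
    """Toy stochastic Petri net: reproduce (Cell -> 2 Cell) and die (Cell -> nothing)."""
    rng = random.Random(seed)
    t = 0.0
    while t < t_end and cells > 0:
        rates = (k_rep * cells, k_die * cells)   # mass-action propensities
        total = rates[0] + rates[1]
        t += rng.expovariate(total)              # time to the next transition firing
        if t >= t_end:
            break
        if rng.random() < rates[0] / total:
            cells += 1                           # reproduction transition fires
        else:
            cells -= 1                           # death transition fires
    return cells
```

With the reproduction rate above the death rate, the population tends to grow; a dead (empty) net stays empty, which is exactly the kind of qualitative expectation the thesis checks against the literature.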
The components in this library are categorised by their complexity. In this categorisation,
the models have four levels, with each level more complex than the former. The
more complex levels are built from the simpler ones in a hierarchical manner. To demonstrate
the practicality of this collection, the models are applied to two different microorganisms,
one from each of the main biological superkingdoms: E. coli from the prokaryotes and
Dictyostelium, also known as slime mould, from the eukaryotes.
Each model contains a set of rate constants that define the speed of the reactions. A set
of expected behaviours based on the biological literature is defined for these models and compared
with the results of the analysis of the models. The models are simulated by Spike, a
command-line programme for simulating models built in Snoopy, and are analysed with R and
Python. To achieve the expected results, optimisation methods are used to find the best
possible rates in the models for a defined behaviour. In this thesis, optimisation is
applied to the Dictyostelium model to find the best rates for the accumulation of Dictyostelium
cells in one location to create fruiting bodies. Random Restart Hill Climbing and Simulated
Annealing are the chosen optimisation methods.
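The simulated-annealing loop for rate optimisation can be sketched generically. The objective below is a stand-in (in the thesis, a candidate rate set would be scored by running a Spike simulation and comparing it with the expected behaviour), and the target rates and neighbour step are hypothetical.

```python
import math
import random

def anneal(score, initial, neighbour, t0=1.0, cooling=0.95, steps=500):
    """Simulated annealing: minimise `score` over candidate rate vectors."""
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    t = t0
    for _ in range(steps):
        candidate = neighbour(current)
        s = score(candidate)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if s < current_score or random.random() < math.exp((current_score - s) / t):
            current, current_score = candidate, s
        if current_score < best_score:
            best, best_score = current, current_score
        t *= cooling  # cool the temperature so late moves become greedy
    return best, best_score

# Stand-in objective: squared distance of the rates from a made-up target.
target = [0.5, 1.2, 0.05]
score = lambda r: sum((a - b) ** 2 for a, b in zip(r, target))
neighbour = lambda r: [max(1e-6, x + random.gauss(0, 0.1)) for x in r]

random.seed(0)
best, err = anneal(score, [1.0, 1.0, 1.0], neighbour)
```

Random Restart Hill Climbing is the degenerate case of the same loop with the acceptance probability dropped and the whole search restarted from fresh initial rates.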
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.
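As a concrete instance of task (i), link existence prediction, a minimal common-neighbours scorer is one classic heuristic (the article surveys many alternatives; this example graph is made up):

```python
from itertools import combinations

def common_neighbour_scores(edges):
    """Score each non-adjacent node pair by its number of shared neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:                      # only score missing links
            scores[(u, v)] = len(adj[u] & adj[v])
    return scores

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("b", "d")]
scores = common_neighbour_scores(edges)
# ("a", "d") shares neighbours b and c, making it the top candidate link.
```

Link weight estimation (task iii) and label prediction (task ii) follow the same pattern: compute a relational feature over the existing structure and rank or classify candidates by it.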
Finding the pitfalls in query performance
Despite their popularity, database benchmarks only highlight a small part of the capabilities of any given system. They do not necessarily highlight problematic components encountered in real life or provide hints for further research and engineering.
In this paper we introduce discriminative performance benchmarking, which aids in exploring a larger search space to find performance outliers and their underlying cause. The approach is based on deriving a domain-specific language from a sample query to identify a query workload. SQLscalpel subsequently explores the space using query morphing and simulated annealing to find performance outliers and the query components responsible. To speed up the exploration for often time-consuming experiments, SQLscalpel has been designed to run asynchronously on a large cluster of machines.
Simultaneous modelling and clustering of visual field data
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
In the health-informatics and bio-medical domains, clinicians produce an enormous amount of data which can be complex and high in dimensionality. This scenario includes visual field data, which are used for managing the second leading cause of blindness in the world: glaucoma. Visual field data are the most common type of data collected to diagnose glaucoma in patients, and usually the data consist of 54 or 76 variables (referred to as visual field locations). Due to the large number of variables, the six nerve fibre bundles (6NFB), a grouping of visual field locations, are the standard clusters used in visual field data to represent the physiological traits of the retina. However, with regard to classification accuracy, this research proposes a technique to find other significant spatial clusters of visual field locations with higher classification accuracy than the 6NFB.
This thesis presents a novel clustering technique, namely Simultaneous Modelling and Clustering (SMC). SMC performs clustering on data based on classification accuracy using heuristic search techniques. The method searches for a collection of significant clusters of visual field locations that indicate visual field loss progression. The aim of this research is two-fold. Firstly, SMC algorithms are developed and tested on data to investigate the effectiveness and efficiency of the method using optimisation and classification methods. Secondly, a significant clustering arrangement of the visual field, in which highly interrelated visual field locations represent the progression of visual field loss with high classification accuracy, is sought to complement the 6NFB in the diagnosis of glaucoma. A new clustering arrangement of visual field locations can be used by medical practitioners together with the 6NFB in diagnosing glaucoma in patients.
This research conducts extensive experimental work on both visual field and simulated data to evaluate the proposed method. The results obtained suggest the proposed method is effective and efficient in clustering visual field data and improving classification accuracy. The key contributions of this work are the novel model-based clustering of visual field data, effective and efficient algorithms for SMC, practical knowledge of visual field data in the diagnosis of glaucoma and the presentation of a generic framework for modelling and clustering which is highly applicable to many other dataset/model combinations.
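The heuristic search at the heart of SMC can be sketched as random-restart hill climbing over cluster assignments. The objective below is a within-cluster variance stand-in (the thesis scores clusterings by classification accuracy on progression labels), and the one-dimensional data points are made up.

```python
import random

def hill_climb_clusters(points, k, score, restarts=5, steps=200, seed=0):
    """Random-restart hill climbing over cluster assignments, minimising `score`."""
    rng = random.Random(seed)
    best, best_s = None, float("inf")
    for _ in range(restarts):
        assign = [rng.randrange(k) for _ in points]   # random initial clustering
        s = score(assign)
        for _ in range(steps):
            i, c = rng.randrange(len(points)), rng.randrange(k)
            if assign[i] == c:
                continue
            old = assign[i]
            assign[i] = c                              # try moving one point
            s2 = score(assign)
            if s2 < s:
                s = s2                                 # keep the improving move
            else:
                assign[i] = old                        # revert a worsening move
        if s < best_s:
            best, best_s = list(assign), s
    return best, best_s

# Stand-in objective: total within-cluster variance of made-up 1-D points.
points = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95]
def variance_score(assign):
    total = 0.0
    for c in set(assign):
        xs = [p for p, a in zip(points, assign) if a == c]
        m = sum(xs) / len(xs)
        total += sum((x - m) ** 2 for x in xs)
    return total

labels, s = hill_climb_clusters(points, 2, variance_score)
```

Swapping `variance_score` for a classifier's cross-validated accuracy (negated, since the search minimises) gives the SMC-style search the abstract describes.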
Influence modelling and learning between dynamic Bayesian networks using score-based structure learning
A Ph.D. thesis submitted to the Faculty of Science, University of the Witwatersrand,
in fulfilment of the requirements for the degree of Doctor of Philosophy in Computer
Science, May 2018.
Although partially observable stochastic processes are ubiquitous in many fields of science,
little work has been devoted to discovering and analysing the means by which several such
processes may interact to influence each other. In this thesis we extend probabilistic structure
learning between random variables to the context of temporal models which represent
partially observable stochastic processes. Learning an influence structure and distribution
between processes can be useful for density estimation and knowledge discovery.
A common approach to structure learning, in observable data, is score-based structure
learning, where we search for the most suitable structure by using a scoring metric to value
structural configurations relative to the data. Most popular structure scores are variations on
the likelihood score which calculates the probability of the data given a potential structure.
In observable data, the decomposability of the likelihood score, which is the ability to
represent the score as a sum of family scores, allows for efficient learning procedures and
significant computational saving. However, with incomplete data (whether due to latent
variables or missing samples), the likelihood score is not decomposable and we have to perform
inference to evaluate it. This forces us to use non-linear optimisation techniques to optimise
the likelihood function. Furthermore, local changes to the network can affect other parts of
the network, which makes learning with incomplete data all the more difficult.
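Decomposability in the complete-data case can be illustrated with a toy example: the log-likelihood of a structure is a sum of per-family scores, so adding an edge A -> B changes only B's family term. The binary data and notation below are assumed for illustration, not taken from the thesis.

```python
import math
from collections import Counter

def family_score(data, child, parents):
    """Log-likelihood contribution of one family (child given its parents), at the MLE."""
    joint = Counter(tuple(row[p] for p in parents) + (row[child],) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    return sum(n * math.log(n / marg[key[:-1]]) for key, n in joint.items())

def structure_score(data, families):
    # Decomposability: the full log-likelihood is a sum of family scores.
    return sum(family_score(data, c, ps) for c, ps in families.items())

# Toy binary data where B copies A three times out of four.
data = [{"A": a, "B": (a + n) % 2} for a in (0, 1) for n in (0, 0, 0, 1)]
with_edge = structure_score(data, {"A": (), "B": ("A",)})
no_edge = structure_score(data, {"A": (), "B": ()})
```

Because the score splits per family, a local search need only recompute the families whose parent sets changed; this is precisely the saving that is lost once incomplete data forces inference into the evaluation.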
We define two general types of influence scenarios: direct influence and delayed influence,
which can be used to define influence over richly structured spaces consisting of
multiple processes that are interrelated in various ways. We will see that although it is
possible to capture both types of influence in a single complex model by using a setting of
the parameters, complex representations run into fragmentation issues. This is handled by
extending the language of dynamic Bayesian networks to allow us to construct single
compact models that capture the properties of a system’s dynamics, and produce influence
distributions dynamically.
The novelty and intuition of our approach is to learn the optimal influence structure in
layers. We firstly learn a set of independent temporal models, and thereafter, optimise a
structure score over possible structural configurations between these temporal models. Since
the search for the optimal structure is done using complete data we can take advantage of
efficient learning procedures from the structure learning literature. We provide the
following contributions: we (a) introduce the notion of influence between temporal models;
(b) extend traditional structure scores for random variables to structure scores for temporal
models; (c) provide a complete algorithm to recover the influence structure between
temporal models; (d) provide a notion of structural assembles to relate temporal models for
types of influence; and finally, (e) provide empirical evidence for the effectiveness of our
method with respect to generative ground-truth distributions.
The presented results emphasise the trade-off between the likelihood of an influence structure
relative to the ground-truth and the computational complexity of expressing it. Depending on the
availability of samples we might choose different learning methods to express influence
relations between processes. On one hand, when given too few samples, we may choose to
learn a sparse structure using tree-based structure learning or even using no influence
structure at all. On the other hand, when given an abundant number of samples, we can use
penalty-based procedures that achieve rich meaningful representations using local search
techniques.
Once we consider high-level representations of dynamic influence between temporal models,
we open the door to very rich and expressive representations which emphasise the
importance of knowledge discovery and density estimation in the temporal setting.
A simulation modelling approach to improve the OEE of a bottling line
This dissertation presents a simulation approach to improving the efficiency performance, in terms of OEE (overall equipment effectiveness), of an automated bottling line. A simulation model of the system is created by means of the software AnyLogic and used to solve the case. The problems faced are a sequencing problem, related to the order in which the bottle formats are processed, and a buffer sizing problem. Both theoretical aspects (OEE, job sequencing and simulation) and practical aspects are presented.
Optimisation of the structural design of asphalt pavements for streets and highways
The construction of asphalt pavements in streets and highways is an activity that requires optimizing the consumption of significant economic and natural resources. Pavement design optimization meets contradictory objectives according to the availability of resources and users' needs. This dissertation explores the application of metaheuristics to optimize the design of asphalt pavements using an incremental design based on the prediction of damage and vehicle operating costs (VOC). The costs are proportional to energy and resource consumption and polluting emissions. The evolution of asphalt pavement design and of metaheuristic optimization techniques on this topic was reviewed. Four computer programs were developed: (1) UNLEA, a program for the structural analysis of multilayer systems. (2) PSO-UNLEA, a program that uses the particle swarm optimization metaheuristic (PSO) for the backcalculation of pavement moduli. (3) UNPAVE, an incremental pavement design program based on the equations of the North American MEPDG that includes the computation of vehicle operating costs based on the IRI. (4) PSO-PAVE, a PSO program that searches for thicknesses that optimize the design considering construction and vehicle operating costs. The case studies show that the backcalculation and structural design of pavements can be optimized by PSO considering restrictions on the thicknesses and the selection of materials. Future developments should reduce the computational cost and calibrate the pavement performance and VOC models.
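The particle swarm optimisation behind PSO-UNLEA and PSO-PAVE can be sketched generically. The objective below is a stand-in squared-error function (the real programs minimise backcalculation error or construction plus vehicle operating costs), and the bounds and "true" layer thicknesses are made up.

```python
import random

def pso(f, dim, bounds, n=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=1):
    """Minimise f over a box [lo, hi]^dim with particle swarm optimisation."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_f = [f(p) for p in pos]
    g = min(range(n), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]     # swarm-wide best position
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fi = f(pos[i])
            if fi < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], fi
                if fi < gbest_f:
                    gbest, gbest_f = pos[i][:], fi
    return gbest, gbest_f

# Stand-in objective: squared error against made-up "true" layer thicknesses (cm).
true_thickness = [15.0, 25.0, 40.0]
err = lambda x: sum((a - b) ** 2 for a, b in zip(x, true_thickness))
best, best_err = pso(err, 3, (5.0, 60.0))
```

Thickness bounds and material choices enter as the box constraints (and, in practice, penalty terms), which is how the case studies restrict the search space.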