Re-pair for Trees
We introduce a new linear-time compression algorithm, called 'Re-pair for Trees', which compresses ordered trees over a ranked alphabet using linear straight-line context-free tree grammars. Such grammars generalize straight-line context-free string grammars and allow basic tree operations, like traversal along edges, to be executed without prior decompression. Our algorithm can be considered a generalization of the 'Re-pair' algorithm developed by N. Jesper Larsson and Alistair Moffat in 2000. The latter is a dictionary-based compression algorithm for strings.
We also introduce a succinct coding specialized in further compressing the grammars generated by our algorithm. This is accomplished without losing the ability to directly execute queries on this compressed representation of the input tree. Finally, we compare the grammars and output files generated by a prototype of the Re-pair for Trees algorithm with those of similar compression algorithms. The obtained results show that our algorithm outperforms its competitors in terms of compression ratio, runtime and memory usage.
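To illustrate the idea that the tree algorithm generalizes, the following is a minimal sketch of Re-pair on strings: the most frequent pair of adjacent symbols is repeatedly replaced by a fresh nonterminal until no pair occurs twice. This is a naive quadratic-time illustration, not the linear-time algorithm of Larsson and Moffat and not the tree variant described above; the function names `repair` and `expand` and the rule names `R0`, `R1`, ... are hypothetical.

```python
from collections import Counter


def repair(text):
    """Naive string Re-pair. Returns (compressed sequence, grammar rules).

    Assumes the input symbols do not collide with the generated
    nonterminal names R0, R1, ... (an assumption of this sketch).
    """
    seq = list(text)
    rules = {}
    while True:
        # Count adjacent pairs (overlapping occurrences are counted too).
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so no further rule pays off
        name = f"R{len(rules)}"
        rules[name] = pair
        # Replace non-overlapping occurrences of the pair, left to right.
        out = []
        i = 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Decompress by recursively expanding nonterminals."""
    out = []
    for symbol in seq:
        if symbol in rules:
            out.extend(expand(rules[symbol], rules))
        else:
            out.append(symbol)
    return out


compressed, grammar = repair("abcabcabc")
restored = "".join(expand(compressed, grammar))
```

The resulting grammar is straight-line: each rule has exactly one right-hand side and rules only refer to earlier rules, so the grammar derives exactly one string.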
Statistical modeling of oscillating biological networks for structure inference and experimental design
Oscillations lie at the core of many biological processes, from the cell cycle, to
circadian oscillations and developmental processes. They are essential to enable
organisms to adapt to varying conditions in environmental cycles, from day/night
to seasonal. Transcriptional regulatory networks are one of the mechanisms behind
these biological oscillations. One of the main problems of computational
systems biology is elucidating the interaction between biological components. A
common mathematical abstraction is to represent these interactions as networks
whose nodes are the reactive species and whose edges are the interactions. There is
abundant literature dealing with the reconstruction of the network structure from
steady-state gene expression measurements; still, much progress remains
to be made because of the complex nature of biological systems. Experimental
design is another obstacle to overcome; we wish to perform experiments that help
us best define the network structure according to our current knowledge of the
system.
In the first chapters of this thesis we will focus on reconstructing the network
structure of biological oscillators by explicitly leveraging the cyclical nature of
the transcriptional signals. We present a method for reconstructing network interactions
tailored to this special but important class of genetic circuits. The
method is based on projecting the signal onto a set of oscillatory basis functions.
We build a Bayesian hierarchical model within a frequency domain linear model
in order to enforce sparsity and incorporate prior knowledge about the network
structure. Experiments on real and simulated data show that the method can
lead to substantial improvements over competing approaches if the oscillatory
assumption is met, and remains competitive even when it is not.
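The projection step mentioned above can be sketched as follows. This is only an illustration of projecting a signal onto sine and cosine basis functions, not the Bayesian hierarchical model of the thesis; it assumes uniformly spaced samples covering a whole number of periods, so that the basis functions are orthogonal and the coefficients reduce to inner products. The function name `fourier_coefficients` and its parameters are hypothetical.

```python
import math


def fourier_coefficients(y, period_samples, n_harmonics):
    """Project a uniformly sampled signal onto a truncated Fourier basis.

    y              -- list of samples covering a whole number of periods
    period_samples -- number of samples in one period (assumed known)
    n_harmonics    -- number of harmonics k = 1..n_harmonics to keep

    Returns (mean, [(a_1, b_1), ..., (a_K, b_K)]) such that
    y[i] is approximated by
    mean + sum_k a_k*cos(2*pi*k*i/P) + b_k*sin(2*pi*k*i/P).
    """
    n = len(y)
    mean = sum(y) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        w = 2 * math.pi * k / period_samples
        # Inner products with the cosine and sine basis functions.
        a = 2 / n * sum(v * math.cos(w * i) for i, v in enumerate(y))
        b = 2 / n * sum(v * math.sin(w * i) for i, v in enumerate(y))
        coeffs.append((a, b))
    return mean, coeffs


# Synthetic "expression profile": 48 hourly samples with a 24-hour period.
signal = [
    3.0
    + 2.0 * math.cos(2 * math.pi * i / 24)
    + 0.5 * math.sin(2 * math.pi * 2 * i / 24)
    for i in range(48)
]
mean, coeffs = fourier_coefficients(signal, period_samples=24, n_harmonics=2)
```

With irregular sampling or partial periods the basis is no longer orthogonal, and the coefficients would instead be estimated by regression, which is where a linear model in the frequency domain becomes the natural setting.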
Having defined a model for gene expression in oscillatory systems, we also consider
the problem of designing informative experiments for elucidating the dynamics
and better identifying the model. We demonstrate our approach on a benchmark
scenario in plant biology, the circadian clock network of Arabidopsis thaliana, and
discuss the relative value of three types of commonly used experiments in terms
of aiding the reconstruction of the network.
Finally, we provide the architecture and design of a software implementation to
plug statistical methods of gene expression inference and network reconstruction
into a biological data integration platform.