Representing Conversations for Scalable Overhearing
Open distributed multi-agent systems are gaining interest in the academic
community and in industry. In such open settings, agents are often coordinated
using standardized agent conversation protocols. The representation of such
protocols (for analysis, validation, monitoring, etc.) is an important aspect of
multi-agent applications. Recently, Petri nets have been shown to be an
interesting approach to such representation, and radically different approaches
using Petri nets have been proposed. However, their relative strengths and
weaknesses have not been examined. Moreover, their scalability and suitability
for different tasks have not been addressed. This paper addresses both these
challenges. First, we analyze existing Petri net representations in terms of
their scalability and appropriateness for overhearing, an important task in
monitoring open multi-agent systems. Then, building on the insights gained, we
introduce a novel representation using Colored Petri nets that explicitly
represent legal joint conversation states and messages. This representation
approach offers significant improvements in scalability and is particularly
suitable for overhearing. Furthermore, we show that this new representation
offers a comprehensive coverage of all conversation features of FIPA
conversation standards. We also present a procedure for transforming AUML
conversation protocol diagrams (a standard human-readable representation) to
our Colored Petri net representation.
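To make the idea of tracking joint conversation states from overheard messages concrete, here is a minimal sketch. It is not the representation proposed in the paper: the protocol is a simplified request-style exchange, the place names and the `ConversationNet` class are hypothetical, and the only Colored Petri net feature illustrated is that each token carries a colour (here, a conversation id), so one net can follow many conversations at once.

```python
from collections import defaultdict

# (source place, overheard performative) -> target place.
# Simplified request-style protocol; place names are hypothetical.
TRANSITIONS = {
    ("start", "request"): "requested",
    ("requested", "refuse"): "refused",
    ("requested", "agree"): "agreed",
    ("agreed", "inform"): "done",
    ("agreed", "failure"): "failed",
}


class ConversationNet:
    """One net shared by all conversations; token colour = conversation id."""

    def __init__(self):
        self.marking = defaultdict(set)  # place -> set of token colours

    def open(self, conv_id):
        """Put a token for a new conversation in the initial place."""
        self.marking["start"].add(conv_id)

    def overhear(self, conv_id, performative):
        """Fire the transition enabled by this message.

        Returns False (and leaves the marking unchanged) if the message
        is not legal in the conversation's current joint state.
        """
        for (src, msg), dst in TRANSITIONS.items():
            if msg == performative and conv_id in self.marking[src]:
                self.marking[src].remove(conv_id)
                self.marking[dst].add(conv_id)
                return True
        return False

    def state_of(self, conv_id):
        """Return the place currently holding this conversation's token."""
        for place, tokens in self.marking.items():
            if conv_id in tokens:
                return place
        return None
```

An overhearer would call `open` when it first sees a conversation and `overhear` for each intercepted message; a `False` result flags a message that violates the protocol.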
Logical Hidden Markov Models
Logical hidden Markov models (LOHMMs) upgrade traditional hidden Markov
models to deal with sequences of structured symbols in the form of logical
atoms, rather than flat characters.
This note formally introduces LOHMMs and presents solutions to the three
central inference problems for LOHMMs: evaluation, most likely hidden state
sequence and parameter estimation. The resulting representation and algorithms
are experimentally evaluated on problems from the domain of bioinformatics.
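For readers unfamiliar with the inference problems named above, the following sketch shows the first two (evaluation and most likely hidden state sequence) for an ordinary HMM over flat symbols. It is not a LOHMM: logical atoms, unification and abstract transitions are not modelled. The two-state model and all of its probabilities are hypothetical values chosen only for illustration.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Evaluation: probability of the observation sequence under the HMM."""
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            s: emit_p[s][o] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Decoding: (joint probability, path) of the most likely state sequence."""
    best = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_best = {}
        for s in states:
            # Best predecessor of state s at this step.
            prob, path = max(
                ((best[r][0] * trans_p[r][s], best[r][1]) for r in states),
                key=lambda t: t[0],
            )
            new_best[s] = (prob * emit_p[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda t: t[0])


# Hypothetical toy model over nucleotides.
STATES = ("H", "L")
START = {"H": 0.5, "L": 0.5}
TRANS = {"H": {"H": 0.5, "L": 0.5}, "L": {"H": 0.4, "L": 0.6}}
EMIT = {
    "H": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "L": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}

if __name__ == "__main__":
    seq = list("GGCA")
    print(forward(seq, STATES, START, TRANS, EMIT))
    print(viterbi(seq, STATES, START, TRANS, EMIT))
```

The third problem, parameter estimation, is typically handled for flat HMMs with the Baum-Welch (EM) procedure, which reuses the forward pass shown here together with a symmetric backward pass.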
Temporal and Relational Models for Causality: Representation and Learning
Discovering causal dependence is central to understanding the behavior of complex systems and to selecting actions that will achieve particular outcomes. The majority of work in this area has focused on propositional domains, where data instances are assumed to be independent and identically distributed (i.i.d.). However, many real-world domains are inherently relational, i.e., they consist of multiple types of entities that interact with each other, and temporal, i.e., they change over time. This thesis focuses on causal modeling for these more complex relational and temporal domains. It provides an in-depth investigation of the properties of relational models and extends their expressivity to include a temporal dimension. Specifically, we first investigate alternative ways to ground relational models, and we provide an in-depth analysis of the impact of alternative grounding semantics for feature construction, causal effect estimation, and model selection. Then, we extend relational models to represent discrete time. We generalize the theory of d-separation for this class of temporal and relational models. Finally, we provide a constraint-based algorithm, TRCD, to learn the structure of temporal relational models from data.
Modelling Incremental Self-Repair Processing in Dialogue.
PhD
Self-repairs, where speakers repeat themselves, reformulate or restart what they are saying, are
pervasive in human dialogue. These phenomena provide a window into real-time human language
processing. For explanatory adequacy, a model of dialogue must include mechanisms that
account for them. Artificial dialogue agents also need this capability for more natural interaction
with human users. This thesis investigates the structure of self-repair and its function in the
incremental construction of meaning in interaction.
A corpus study shows how the range of self-repairs seen in dialogue cannot be accounted for
by looking at surface form alone. More particularly, it analyses a string-alignment approach and
shows how it is insufficient, and it provides requirements for a suitable model of incremental
context and an ontology of self-repair function.
An information-theoretic model is developed which addresses these issues along with a system
that automatically detects self-repairs and edit terms on transcripts incrementally with minimal
latency, achieving state-of-the-art results. Additionally, it is shown to have practical use in
the psychiatric domain.
The thesis goes on to present a dialogue model to interpret and generate repaired utterances
incrementally. When processing repaired rather than fluent utterances, it achieves the same
degree of incremental interpretation and incremental representation. Practical implementation
methods are presented for an existing dialogue system.
Finally, a more pragmatically oriented approach is presented to model self-repairs in a psycholinguistically
plausible way. This is achieved through extending the dialogue model to include
a probabilistic semantic framework to perform incremental inference in a reference resolution
domain.
The thesis concludes that at least as fine-grained a model of context as word-by-word is required
for realistic models of self-repair, and context must include linguistic action sequences
and information update effects. The way dialogue participants process self-repairs to make inferences
in real time, rather than filter out their disfluency effects, has been modelled formally and
in practical systems.
Engineering and Physical Sciences Research Council (EPSRC)
Doctoral Training Account (DTA) scholarship from the School of Electronic Engineering and
Computer Science at Queen Mary University of London
Algebraic tools in phylogenomics.
In this thesis we develop interdisciplinary algebraic tools for genomic and phylogenetic problems.
To study the molecular evolution of species, one often uses stochastic evolutionary models. The evolution is represented in a tree (called a phylogenetic tree) whose leaves represent current species and whose internal nodes correspond to their common ancestors. The length of a branch of the tree represents the number of mutations that have occurred between the two species adjacent to the branch. The evolution of DNA sequences in these species is then modeled with a hidden Markov process along the tree. If the Markov process is assumed to be continuous in time, it is usually assumed to be homogeneous as well and, if so, the model parameters are the entries of an instantaneous mutation rate matrix and the lengths of the branches. If the Markov process is discrete in time, then the model parameters are the conditional probabilities of nucleotide substitution along the tree and there is no assumption of homogeneity. The latter are the types of models we consider in this thesis and are therefore more general than the homogeneous continuous ones.
From this perspective we study the basic problems of phylogenetics: given a set of DNA sequences, what is the evolutionary model that best fits the data, and how can we efficiently infer the model parameters? Moreover, as we have also checked in this thesis, it is possible that species have not evolved along a single tree but along a mixture of trees, so these questions must also be addressed in this more general case. For continuous-time, homogeneous evolutionary models, several solutions to these questions have been proposed during the last decades. In this thesis we solve these two problems for discrete-time evolutionary models, using algebraic techniques from linear algebra, group theory, algebraic geometry and algebraic statistics. In addition, our solution to the first problem is also valid for phylogenetic mixtures.
We have tested the methods proposed in this thesis on simulated data and on real data from the ENCODE project (Encyclopedia Of DNA Elements). To test our methods, we also provide algorithms to generate sequences evolving under discrete-time models with a given expected number of mutations. Moreover, we have proved that these algorithms generate all possible sequences (for most models). Tests on simulated data show that the methods are very accurate, and our results on real data confirm previously formulated hypotheses. All the methods in this thesis have been implemented for an arbitrary number of species and are publicly available.
Postprint (published version)
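As a rough illustration of the discrete-time model described in the abstract above, the sketch below simulates DNA sequences down a tree in which every edge carries its own matrix of conditional substitution probabilities, so no homogeneity is assumed. It is not the thesis's generation algorithm (which fixes the expected number of mutations): the tree layout, the species names and the `uniform_change_matrix` helper are hypothetical, and sites are assumed to evolve independently.

```python
import random

NUCLEOTIDES = ("A", "C", "G", "T")


def uniform_change_matrix(p):
    """Hypothetical edge matrix: keep the nucleotide with prob. 1 - p,
    otherwise switch to one of the three others with prob. p / 3 each."""
    return {
        x: {y: (1.0 - p) if x == y else p / 3.0 for y in NUCLEOTIDES}
        for x in NUCLEOTIDES
    }


def sample(dist, rng):
    """Draw one nucleotide from a dict {nucleotide: probability}."""
    return rng.choices(NUCLEOTIDES, weights=[dist[n] for n in NUCLEOTIDES])[0]


def evolve(tree, root_dist, length, rng):
    """Simulate `length` independent sites down the tree; return leaf sequences."""
    leaves = {}

    def descend(node, seq):
        children = node.get("children", [])
        if not children:
            leaves[node["name"]] = "".join(seq)
            return
        for child in children:
            matrix = child["matrix"]  # substitution probabilities on this edge
            descend(child, [sample(matrix[x], rng) for x in seq])

    descend(tree, [sample(root_dist, rng) for _ in range(length)])
    return leaves


# Hypothetical three-species tree; each edge has a different matrix.
TREE = {
    "name": "root",
    "children": [
        {"name": "human", "matrix": uniform_change_matrix(0.05)},
        {
            "name": "ancestor",
            "matrix": uniform_change_matrix(0.10),
            "children": [
                {"name": "mouse", "matrix": uniform_change_matrix(0.20)},
                {"name": "rat", "matrix": uniform_change_matrix(0.15)},
            ],
        },
    ],
}
ROOT_DIST = {n: 0.25 for n in NUCLEOTIDES}

if __name__ == "__main__":
    for name, seq in evolve(TREE, ROOT_DIST, 40, random.Random(0)).items():
        print(name, seq)
```

Only the leaf sequences are returned, which mirrors the hidden Markov setting: the ancestral sequences at internal nodes are generated but never observed.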