Software agents interact to solve tasks, the details of which need to be described
in a language understandable by all the actors involved. Ontologies provide a formalism
for defining both the domain of the task and the terminology used to describe
it. However, finding a shared ontology has proved difficult: different institutions and
developers have different needs and formalise them in different ontologies.
In a closed environment it is possible to force all the participants to share the
same ontology, while in open and distributed environments ontology mapping can provide
interoperability between heterogeneous interacting actors. However, conventional mapping systems focus on acquiring static information and on mapping whole ontologies, an approach that is infeasible in open systems.
This thesis presents a different approach to the problem of heterogeneity. It starts
from the intuitive idea that when similar situations arise, similar interactions are performed.
If the interactions between actors are specified in formal scripts, shared by
all the participants, then when the same situation arises, the same script is used. The
main hypothesis that this thesis aims to demonstrate is that, by analysing different runs of these scripts, it is possible to create a statistical model of the interactions that reflects the frequency of terms in messages and the frequency of ontological relations between terms in different messages. The model is then used, during a run of a known interaction, to compute a probability distribution over the terms that can appear in received messages. This distribution provides additional information, contextual to the interaction, that a traditional ontology matcher can exploit: it improves efficiency, by restricting comparisons to the candidates most likely in the context, and potentially also recall and precision, in particular by aiding disambiguation.
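To make the term-frequency half of this idea concrete, the sketch below shows one way such a predictor could be organised (it omits the ontological relations between terms in different messages). The class and function names, the slot-based indexing of messages, and the pruning threshold are illustrative assumptions only, not the design developed in the thesis.

    # A minimal sketch of a frequency-based predictor for terms in messages,
    # assuming interactions are runs of shared scripts whose messages have
    # named slots. All names and the pruning strategy are hypothetical.
    from collections import Counter, defaultdict

    class TermPredictor:
        def __init__(self):
            # (script, message, slot) -> counts of terms seen there in past runs
            self.counts = defaultdict(Counter)

        def observe(self, script, message, slot, term):
            """Record a term seen in a message slot during a completed run."""
            self.counts[(script, message, slot)][term] += 1

        def distribution(self, script, message, slot):
            """Probability distribution over the terms expected in this slot."""
            seen = self.counts[(script, message, slot)]
            total = sum(seen.values())
            if total == 0:
                return {}  # no history: the matcher falls back to a full search
            return {term: n / total for term, n in seen.items()}

    def candidate_terms(predictor, script, message, slot, threshold=0.05):
        """Candidates a matcher would compare first: terms whose predicted
        probability exceeds a (hypothetical) pruning threshold."""
        dist = predictor.distribution(script, message, slot)
        return sorted((t for t, p in dist.items() if p >= threshold),
                      key=lambda t: -dist[t])

    # Example: after a few runs of a purchase script, "price" dominates the
    # slot and would be compared before rarer alternatives.
    p = TermPredictor()
    for term in ["price", "price", "cost", "price", "tariff"]:
        p.observe("purchase", "ask", "attribute", term)
    print(candidate_terms(p, "purchase", "ask", "attribute"))
    # -> ['price', 'cost', 'tariff']

Under these assumptions, a matcher receiving an unknown term in a slot compares it only against the predicted candidates first, widening the search only if no acceptable match is found.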
The ability to create a model that reflects real phenomena in this sort of environment is evaluated by analysing the quality of the predictions, verifying in particular how features of the interactions, such as their non-stationarity, affect prediction quality. The improvements the predictor brings to a matcher we developed are also evaluated. The overall results are very promising: using the predictor can reduce the computation time for matching by a factor of ten, while maintaining, and in some cases improving, recall and precision.