22,773 research outputs found
Word-to-Word Models of Translational Equivalence
Parallel texts (bitexts) have properties that distinguish them from other
kinds of parallel data. First, most words translate to only one other word.
Second, bitext correspondence is noisy. This article presents methods for
biasing statistical translation models to reflect these properties. Analysis of
the expected behavior of these biases in the presence of sparse data predicts
that they will result in more accurate models. The prediction is confirmed by
evaluation with respect to a gold standard -- translation models that are
biased in this fashion are significantly more accurate than a baseline
knowledge-poor model. This article also shows how a statistical translation
model can take advantage of various kinds of pre-existing knowledge that might
be available about particular language pairs. Even the simplest kinds of
language-specific knowledge, such as the distinction between content words and
function words, is shown to reliably boost translation model performance on
some tasks. Statistical models that are informed by pre-existing knowledge
about the model domain combine the best of both the rationalist and empiricist
traditions
Weakly-Supervised Alignment of Video With Text
Suppose that we are given a set of videos, along with natural language
descriptions in the form of multiple sentences (e.g., manual annotations, movie
scripts, sport summaries etc.), and that these sentences appear in the same
temporal order as their visual counterparts. We propose in this paper a method
for aligning the two modalities, i.e., automatically providing a time stamp for
every sentence. Given vectorial features for both video and text, we propose to
cast this task as a temporal assignment problem, with an implicit linear
mapping between the two feature modalities. We formulate this problem as an
integer quadratic program, and solve its continuous convex relaxation using an
efficient conditional gradient algorithm. Several rounding procedures are
proposed to construct the final integer solution. After demonstrating
significant improvements over the state of the art on the related task of
aligning video with symbolic labels [7], we evaluate our method on a
challenging dataset of videos with associated textual descriptions [36], using
both bag-of-words and continuous representations for text.Comment: ICCV 2015 - IEEE International Conference on Computer Vision, Dec
2015, Santiago, Chil
An information retrieval approach to ontology mapping
In this paper, we present a heuristic mapping method and a prototype mapping system that support the process of semi-automatic ontology mapping for the purpose of improving semantic interoperability in heterogeneous systems. The approach is based on the idea of semantic enrichment, i.e., using instance information of the ontology to enrich the original ontology and calculate similarities between concepts in two ontologies. The functional settings for the mapping system are discussed and the evaluation of the prototype implementation of the approach is reported. \ud
\u
An Experimental Digital Library Platform - A Demonstrator Prototype for the DigLib Project at SICS
Within the framework of the Digital Library project at SICS, this thesis describes the implementation of a demonstrator prototype of a digital library (DigLib); an experimental platform integrating several functions in one common interface. It includes descriptions of the structure and formats of the digital library collection, the tailoring of the search engine Dienst, the construction of a keyword extraction tool, and the design and development of the interface. The platform was realised through sicsDAIS, an agent interaction and presentation system, and is to be used for testing and evaluating various tools for information seeking. The platform supports various user interaction strategies by providing: search in bibliographic records (Dienst); an index of keywords (the Keyword Extraction Function (KEF)); and browsing through the hierarchical structure of the collection. KEF was developed for this thesis work, and extracts and presents keywords from Swedish documents. Although based on a comparatively simple algorithm, KEF contributes by supplying a long-felt want in the area of Information Retrieval. Evaluations of the tasks and the interface still remain to be done, but the digital library is very much up and running. By implementing the platform through sicsDAIS, DigLib can deploy additional tools and search engines without interfering with already running modules. If wanted, agents providing other services than SICS can supply, can be plugged in
- …