Cross-lingual Entity Alignment via Joint Attribute-Preserving Embedding
Entity alignment is the task of finding entities in two knowledge bases (KBs)
that represent the same real-world object. When facing KBs in different natural
languages, conventional cross-lingual entity alignment methods rely on machine
translation to eliminate the language barriers. These approaches often suffer
from the uneven quality of translations between languages. While recent
embedding-based techniques encode entities and relationships in KBs and do not
need machine translation for cross-lingual entity alignment, a significant
number of attributes remain largely unexplored. In this paper, we propose a
joint attribute-preserving embedding model for cross-lingual entity alignment.
It jointly embeds the structures of two KBs into a unified vector space and
further refines it by leveraging attribute correlations in the KBs. Our
experimental results on real-world datasets show that this approach
significantly outperforms the state-of-the-art embedding approaches for
cross-lingual entity alignment and could be complemented with methods based on
machine translation.
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
Although more and more language pairs are covered by machine translation
services, there are still many pairs that lack translation resources.
Cross-language information retrieval (CLIR) is an application which needs
translation functionality of a relatively low level of sophistication since
current models for information retrieval (IR) are still based on a
bag-of-words representation. The Web provides a vast resource for the automatic construction
of parallel corpora which can be used to train statistical translation models
automatically. The resulting translation models can be embedded in several ways
in a retrieval model. In this paper, we will investigate the problem of
automatically mining parallel texts from the Web and different ways of
integrating the translation models within the retrieval process. Our
experiments on standard test collections for CLIR show that the Web-based
translation models can surpass commercial MT systems in CLIR tasks. These
results open the perspective of constructing a fully automatic query
translation device for CLIR at a very low cost.
Comment: 37 pages
Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis (Extended)
This extended paper presents 1) a novel hierarchy and recursion extension to
the process tree model; and 2) the first recursion-aware process model
discovery technique that leverages hierarchical information in event logs,
typically available for software systems. This technique allows us to analyze
the operational processes of software systems under real-life conditions at
multiple levels of granularity. The work can be positioned between reverse
engineering and process mining. An implementation of the proposed approach is
available as a ProM plugin. Experimental results based on real-life (software)
event logs demonstrate the feasibility and usefulness of the approach and show
the huge potential to speed up discovery by exploiting the available hierarchy.
Comment: Extended version (14 pages total) of the paper Recursion Aware
Modeling and Discovery For Hierarchical Software Event Log Analysis. This
Technical Report version includes the guarantee proofs for the proposed
discovery algorithm.