Editorial of the special issue on latest advancements in linguistic linked data
Since the inception of the Open Linguistics Working Group in 2010, there have been numerous efforts to transform language resources into Linked Data. The research field of Linguistic Linked Data (LLD) has gained in importance, visibility and impact, with the Linguistic Linked Open Data (LLOD) cloud now gathering over 200 resources. With this growth, new challenges have emerged concerning particular domain and task applications, quality dimensions, and the linguistic features to take into account. This special issue aims to review and summarize the progress and status of LLD research in recent years, as well as to offer an understanding of the challenges facing the field in the years to come. The papers in this issue indicate that there are still aspects to address for a wider community adoption of LLD, as well as a lack of resources for specific tasks and (interdisciplinary) domains. Likewise, the integration of LLD resources into Natural Language Processing (NLP) architectures and the search for long-term infrastructure solutions to host LLD resources remain essential points to attend to in the foreseeable future of this line of research.
Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling
This paper presents a Kernel Entity Salience Model (KESM) that improves text
understanding and retrieval by better estimating entity salience (importance)
in documents. KESM represents entities by knowledge-enriched distributed
representations, models the interactions between entities and words by kernels,
and combines the kernel scores to estimate entity salience. The whole model is
learned end-to-end using entity salience labels. The salience model also
improves ad hoc search accuracy, providing effective ranking features by
modeling the salience of query entities in candidate documents. Our experiments
on two entity salience corpora and two TREC ad hoc search datasets demonstrate
the effectiveness of KESM over frequency-based and feature-based methods. We
also provide examples showing how KESM conveys its text understanding ability
learned from entity salience to search.
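The kernel step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Gaussian (RBF) kernels over cosine similarities between an entity embedding and the word embeddings of a document, followed by a linear combination of the kernel scores. The function names, kernel centres (`mus`), width (`sigma`), and weights are hypothetical; in the paper the weights are learned end-to-end from entity salience labels.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def kernel_features(entity_vec, word_vecs, mus, sigma):
    """One soft-count feature per kernel: how many words have a similarity
    to the entity that falls near the kernel centre mu."""
    sims = [cosine(entity_vec, w) for w in word_vecs]
    return [
        sum(math.exp(-((s - mu) ** 2) / (2.0 * sigma ** 2)) for s in sims)
        for mu in mus
    ]


def salience_score(entity_vec, word_vecs, weights, mus, sigma, bias=0.0):
    """Linear combination of kernel features (weights are assumed, not learned here)."""
    feats = kernel_features(entity_vec, word_vecs, mus, sigma)
    return sum(w * f for w, f in zip(weights, feats)) + bias


# Toy usage: one entity, two context words, two kernels (all values hypothetical).
entity = [1.0, 0.0]
words = [[1.0, 0.0], [0.0, 1.0]]
features = kernel_features(entity, words, mus=(0.0, 1.0), sigma=0.1)
score = salience_score(entity, words, weights=(0.5, 2.0), mus=(0.0, 1.0), sigma=0.1)
```

In this toy input one word matches the entity exactly and the other is orthogonal to it, so each kernel picks up roughly one word. A real model would use many more kernels and trained embeddings.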
Language resources and linked data: a practical perspective
Recently, experts and practitioners in language resources
have started recognizing the benefits of the linked data (LD) paradigm
for the representation and exploitation of linguistic data on the Web.
The adoption of the LD principles is leading to an emerging ecosystem of
multilingual open resources that conform to the Linguistic Linked Open
Data Cloud, in which datasets of linguistic data are interconnected and
represented following common vocabularies, which facilitates linguistic
information discovery, integration and access. In order to contribute to
this initiative, this paper summarizes several key aspects of the representation
of linguistic information as linked data from a practical perspective.
The main goal of this document is to provide the basic ideas and
tools for migrating language resources (lexicons, corpora, etc.) as LD on
the Web and to develop some useful NLP tasks with them (e.g., word
sense disambiguation). This material was the basis of a tutorial given
at the EKAW’14 conference, which is also reported in the paper.
A survey of guidelines and best practices for the generation, interlinking, publication, and validation of linguistic linked data
This article discusses a survey carried out within the NexusLinguarum COST Action which aimed to give an overview of existing guidelines (GLs) and best practices (BPs) in linguistic linked data (LLD). In particular, it focused on four core tasks in the production and publication of linked data: generation, interlinking, publication, and validation. We discuss the importance of GLs and BPs for LLD before describing the survey and its results in full. Finally, we offer a number of directions for future work in order to address the findings of the survey.
Cross-Lingual Link Discovery for Under-Resourced Languages
CC BY-NC 4.0
In this paper, we provide an overview of current technologies for cross-lingual link discovery, and we discuss challenges,
experiences and prospects of their application to under-resourced languages. We first introduce the goals of cross-lingual
linking and associated technologies, and in particular, the role that the Linked Data paradigm (Bizer et al., 2011) applied
to language data can play in this context. We define under-resourced languages with a specific focus on languages actively
used on the internet, i.e., languages with a digitally versatile speaker community, but limited support in terms of language
technology. We argue that for languages with considerable amounts of textual data and (at least) a bilingual word list
available, techniques for cross-lingual linking can be readily applied, and that these enable the implementation of downstream
applications for under-resourced languages via the localisation and adaptation of existing technologies and resources.
DEVELOPING MASHUP APPLICATIONS USING EMML
In this thesis we present enterprise mashups and the Enterprise Mashup Markup Language (EMML) in detail. We go through the architecture of enterprise mashups to make the challenges they bring easier to identify, and we stress the need to introduce enterprise mashups in companies. A detailed description follows of the core of EMML as a standard for enterprise mashup development. We present the advantages that EMML brings and identify possible obstacles. Developing enterprise mashups with EMML is demonstrated on a practical case: building mashups for the supervision of international student exchanges.