2 research outputs found
Lexical data API
This API provides data from various dictionary resources of K Dictionaries across 50 languages. It is used by language service providers, app developers, and researchers, and returns data as JSON documents. A basic search result consists of an object containing partial lexical information on entries that match the search criteria, but further in-depth information is also available. Basic search parameters include the source resource, source language, and text (lemma), and the entries are returned as objects within the results array. It is possible to search for words by specific syntactic criteria, such as part of speech, grammatical number, gender, and subcategorization, and to restrict the search to monosemous or polysemous entries. When searching by parameters, each entry result contains a unique entry ID, and each sense has its own unique sense ID. Using these IDs, it is possible to obtain more data on a single entry or sense (such as syntactic and semantic information, multiword expressions, examples of usage, translations, etc.). The software demonstration includes a brief overview of the API with practical examples of its operation.
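The search-then-lookup flow described above can be sketched as follows. This is a minimal illustration only: the abstract does not give the endpoint paths, parameter names, or JSON field names, so the base URL `api.example.org`, the `search`, `entries`, and `senses` paths, the `pos` filter, and the `id` and `senses` keys are all assumptions made for the example. Only the `results` array, the three basic search parameters, and the existence of entry and sense IDs come from the description. The sample response is invented and no network request is made.

```python
import json
from urllib.parse import quote, urlencode

# Hypothetical base URL: the abstract does not state the real one.
BASE_URL = "https://api.example.org"


def build_search_url(source, language, text, **filters):
    """Build a search URL from the three basic parameters plus optional filters.

    The filter names (e.g. pos, number, gender) are assumed, not documented.
    """
    params = {"source": source, "language": language, "text": text}
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}/search?{urlencode(params)}"


def collect_ids(response):
    """Map each entry ID in a search response to the list of its sense IDs."""
    return {
        entry["id"]: [sense["id"] for sense in entry.get("senses", [])]
        for entry in response.get("results", [])
    }


def build_detail_url(kind, identifier):
    """Build a URL for fetching the full data of one entry or one sense."""
    if kind not in ("entries", "senses"):
        raise ValueError("kind must be 'entries' or 'senses'")
    return f"{BASE_URL}/{kind}/{quote(identifier, safe='')}"


# Invented sample shaped like the description: entries inside a results array,
# each with a unique entry ID and senses that carry their own IDs.
SAMPLE_RESPONSE = json.loads("""
{
  "results": [
    {"id": "EN_0001", "headword": "run",
     "senses": [{"id": "EN_0001_S1"}, {"id": "EN_0001_S2"}]},
    {"id": "EN_0002", "headword": "run",
     "senses": [{"id": "EN_0002_S1"}]}
  ]
}
""")

if __name__ == "__main__":
    print(build_search_url("global", "en", "run", pos="verb"))
    for entry_id, sense_ids in collect_ids(SAMPLE_RESPONSE).items():
        print(build_detail_url("entries", entry_id))
        for sense_id in sense_ids:
            print(build_detail_url("senses", sense_id))
```

Keeping URL construction and response parsing separate from any HTTP call makes the two-step pattern (search first, then fetch details by ID) easy to check without credentials or network access.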
A multilingual evaluation dataset for monolingual word sense alignment
Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages and resources and focuses on the more challenging task of linking general-purpose language. We believe that our data will pave the way for further advances in alignment and evaluation of word senses by creating new solutions, particularly those notoriously requiring data such as neural networks. Our resources are publicly available at https://github.com/elexis-eu/MWSA.
The authors would like to thank the three anonymous reviewers for their insightful suggestions and careful reading of the manuscript. This work has received funding from the EU’s Horizon 2020 Research and Innovation programme through the ELEXIS project under grant agreement No. 731015. The contributions in Bulgarian were partially funded by the Bulgarian National Interdisciplinary Research e-Infrastructure for Resources and Technologies in favor of the Bulgarian Language and Cultural Heritage, part of the EU infrastructures CLARIN and DARIAH – CLaDA-BG, Grant number DO1-272/16.12.2019. This work is also supported by Science Foundation Ireland (SFI) under the Insight Center for Data Analytics (Grant Number SFI/12/RC/2289) and the Irish Research Council under the “Cardamom” Consolidator Laureate Grant (IRCLA/2017/129).
peer-reviewed
2020-05-1
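The sense-level annotation described in the alignment dataset abstract above can be illustrated with a small sketch of how gold and predicted relation labels might be compared. This is not the authors' evaluation code. The class and function names, the `NONE` label for unaligned pairs, the sense identifiers, and the toy data are all assumptions made for the example; only the four relation types (equivalence, broadness, narrowness, relatedness) come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """Semantic relations between two senses, as named in the abstract."""
    EXACT = "exact"        # equivalence
    BROADER = "broader"    # broadness
    NARROWER = "narrower"  # narrowness
    RELATED = "related"    # relatedness
    NONE = "none"          # assumption: label for pairs with no alignment


@dataclass(frozen=True)
class SensePair:
    """A candidate pair of senses for one lemma, one from each dictionary."""
    lemma: str
    sense_a: str  # hypothetical sense identifier in the first resource
    sense_b: str  # hypothetical sense identifier in the second resource


def accuracy(gold, predicted):
    """Share of gold pairs whose predicted relation matches the gold label.

    Pairs missing from the predictions count as wrong.
    """
    if not gold:
        raise ValueError("gold annotations must not be empty")
    correct = sum(1 for pair, rel in gold.items() if predicted.get(pair) == rel)
    return correct / len(gold)


def label_distribution(annotations):
    """Count how often each relation label occurs in a set of annotations."""
    return Counter(rel.value for rel in annotations.values())


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    p1 = SensePair("bank", "A:bank.1", "B:bank.1")
    p2 = SensePair("bank", "A:bank.1", "B:bank.2")
    p3 = SensePair("bank", "A:bank.2", "B:bank.2")
    gold = {p1: Relation.EXACT, p2: Relation.NONE, p3: Relation.NARROWER}
    pred = {p1: Relation.EXACT, p2: Relation.RELATED, p3: Relation.NARROWER}
    print(accuracy(gold, pred))
    print(label_distribution(gold))
```

Treating each sense pair as a hashable key keeps the gold and predicted annotations directly comparable, whichever system produced the predictions.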