28 research outputs found

    Dealing with Semantic Heterogeneity Issues on the Web

    Get PDF
    The Semantic Web is an extension of the traditional Web in which the meaning of information is well defined, thus allowing better interaction between people and computers. To accomplish its goals, mechanisms are required to make the semantics of Web resources explicit so that they can be automatically processed by software agents (this semantics being described by means of online ontologies). Nevertheless, issues arise from the semantic heterogeneity that naturally occurs on the Web, namely redundancy and ambiguity. To tackle these issues, we present an approach to discover and represent, in a non-redundant way, the intended meaning of words in Web applications, while taking into account the (often unstructured) context in which they appear. To that end, we have developed novel ontology matching, clustering, and disambiguation techniques. Our work is intended to help bridge the gap between syntax and semantics in the construction of the Semantic Web.
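    To make the disambiguation idea concrete, here is a minimal, self-contained sketch in the Lesk style: the sense whose description overlaps most with the surrounding context wins. The glosses and data are invented for illustration; the techniques described in the paper are more sophisticated than this.

```python
# A minimal sketch of context-based sense disambiguation (hypothetical data;
# not the authors' actual algorithm).

def tokenize(text):
    """Lowercase bag-of-words tokenization."""
    return set(text.lower().split())

def disambiguate(word, context, senses):
    """Pick the sense whose gloss/ontological context overlaps most with the context.

    senses: mapping from sense id to a textual description of that sense.
    """
    context_tokens = tokenize(context)
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.items():
        overlap = len(context_tokens & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense

# Toy example: "bank" in a financial context.
senses = {
    "bank#finance": "financial institution that accepts deposits and makes loans",
    "bank#river": "sloping land beside a body of water such as a river",
}
print(disambiguate("bank", "the bank approved the loan and the deposits", senses))
# -> bank#finance
```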

    Ontology Matching with CIDER: evaluation report for OAEI 2011

    Get PDF
    CIDER is a schema-based ontology alignment system. Its algorithm compares each pair of ontology terms by, first, extracting their ontological contexts up to a certain depth (enriched by means of lightweight inference) and, second, combining different elementary ontology matching techniques. In its current version, CIDER uses artificial neural networks to combine these elementary matchers. In this paper we briefly describe CIDER and comment on its results in the Ontology Alignment Evaluation Initiative 2011 campaign (OAEI’11). In this new approach, the burden of manually selecting weights has been eliminated entirely, while preserving performance with respect to CIDER’s previous participation in the benchmark track (at OAEI’08).
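    The core idea, a set of elementary matchers whose outputs a small neural network combines into one alignment score, can be sketched as follows. The two similarity metrics and all weights below are illustrative placeholders, not CIDER's actual configuration; in the real system the combination weights are learned, which is what removes the manual weight selection.

```python
# Sketch: elementary similarities for a term pair, combined by a tiny
# one-hidden-layer network. Metrics and weights are purely illustrative.
import math

def label_similarity(a, b):
    """Normalized position-wise character match (stand-in for an edit-distance metric)."""
    n = sum(1 for x, y in zip(a.lower(), b.lower()) if x == y)
    return n / max(len(a), len(b))

def context_similarity(ctx_a, ctx_b):
    """Jaccard overlap of the terms' ontological contexts (sets of neighbour labels)."""
    return len(ctx_a & ctx_b) / len(ctx_a | ctx_b) if ctx_a | ctx_b else 0.0

def combine(features, w_hidden, w_out):
    """Combine feature values with sigmoid units (in practice the weights are learned)."""
    sigmoid = lambda x: 1 / (1 + math.exp(-x))
    hidden = [sigmoid(sum(w * f for w, f in zip(row, features))) for row in w_hidden]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))

feats = [
    label_similarity("Author", "Writer"),
    context_similarity({"Person", "writes", "Document"}, {"Person", "writes", "Book"}),
]
# Hand-picked weights, purely for illustration.
score = combine(feats, w_hidden=[[2.0, 3.0], [1.0, 1.5]], w_out=[1.2, 0.8])
print(f"alignment confidence: {score:.2f}")
```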

    Associating Semantics to Multilingual Tags in Folksonomies

    Full text link
    Tagging systems are nowadays a common feature of web sites where user-generated content plays an important role. However, the lack of semantics and of multilinguality hampers information retrieval processes based on folksonomies. In this paper we propose an approach to bring semantics to multilingual folksonomies. The approach includes a sense disambiguation activity and takes advantage of knowledge generated by the masses in the form of Wikipedia articles, redirection and disambiguation links, and translations. We use DBpedia as the semantic resource for defining tag meanings.
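    As a rough illustration of using DBpedia as a sense inventory for tags, the sketch below queries the public DBpedia SPARQL endpoint for resources whose label matches a tag, following redirect links as the paper suggests. The query shape is our own; the disambiguation activity described in the paper draws on more evidence (disambiguation pages, translations, context).

```python
# Sketch: candidate DBpedia senses for a tag, via labels and redirects.
# Endpoint is DBpedia's public one; the query is our own illustration.
import requests

ENDPOINT = "https://dbpedia.org/sparql"

def candidate_senses(tag, lang="en"):
    """Return DBpedia resources that a tag may denote."""
    query = f"""
    SELECT DISTINCT ?sense WHERE {{
      {{ ?sense rdfs:label "{tag}"@{lang} }}
      UNION
      {{ ?alias rdfs:label "{tag}"@{lang} .
         ?alias dbo:wikiPageRedirects ?sense }}
    }} LIMIT 20
    """
    r = requests.get(ENDPOINT, params={
        "query": query,
        "format": "application/sparql-results+json",
    })
    r.raise_for_status()
    return [b["sense"]["value"] for b in r.json()["results"]["bindings"]]

print(candidate_senses("Jaguar"))  # the cat, the car maker, the band, ...
```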

    Best practices for multilingual linked open data: a community effort

    Full text link
    The W3C Best Practices for Multilingual Linked Open Data community group was born one year ago during the last MLW workshop in Rome. It continues to lead the effort of a large community towards acquiring a shared view of the issues caused by multilingualism on the Web of Data, and of their possible solutions. Despite our initial optimism, we found the task of identifying best practices for ML-LOD a difficult one, requiring a deep understanding of the Web of Data in its multilingual dimension and in its practical problems. In this talk we will review the progress of the group so far, mainly in the identification and analysis of topics, use cases, and design patterns, as well as the challenges ahead.

    Multilingual Variation in the context of Linked Data

    Get PDF
    In this paper we present a revisited classification of term variation in the light of the Linked Data initiative. Linked Data refers to a set of best practices for publishing and connecting structured data on the Web, with the idea of transforming it into a global graph. One of the crucial steps of this initiative is the linking step, in which datasets in one or more languages need to be linked or connected with one another. We claim that the linking process would be facilitated if datasets were enriched with lexical and terminological information. With that final aim in mind, we propose a classification of lexical, terminological and semantic variants that will become part of a model of linguistic descriptions currently being proposed within the framework of the W3C Ontology-Lexica Community Group to enrich ontologies and Linked Data vocabularies. Examples of modeling solutions for the different types of variants are also provided.

    Semantic enrichment of models in the DynaLearn learning environment

    Get PDF
    In this work we present our contribution to the DynaLearn learning environment. DynaLearn is an interactive modeling tool for education based on the "learning by modeling" approach. It allows students to build Qualitative Reasoning (QR) models that formally represent a domain of their interest. This process helps students gain a better understanding of the domain and predict the behavior of the modeled system under possible changes.
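    For readers unfamiliar with Qualitative Reasoning, the toy sketch below propagates directions of change along positive proportionalities, which is the flavor of inference QR models support. It is our own minimal illustration, not DynaLearn's reasoning engine.

```python
# Toy QR propagation: quantities have a qualitative derivative in {-1, 0, +1};
# proportionalities (P+/P-) propagate the direction of change between them.

derivative = {"rainfall": +1}            # rainfall is increasing
proportionalities = [                    # (source, target, sign)
    ("rainfall", "river_level", +1),     # P+: river level follows rainfall
    ("river_level", "flood_risk", +1),   # P+: flood risk follows river level
]

changed = True
while changed:                           # propagate until a fixed point
    changed = False
    for src, tgt, sign in proportionalities:
        if src in derivative:
            d = derivative[src] * sign
            if derivative.get(tgt) != d:
                derivative[tgt] = d
                changed = True

print(derivative)
# -> {'rainfall': 1, 'river_level': 1, 'flood_risk': 1}
```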

    Enabling Language Resources to expose translations as linked data on the web

    Full text link
    Language resources, such as multilingual lexica and multilingual electronic dictionaries, contain collections of lexical entries in several languages. Having access to the explicit or implicit translation relations between such entries can be of great interest for many NLP-based applications. By using Semantic Web techniques, translations can be made available on the Web and consumed by other (semantics-enabled) resources in a direct manner, without relying on application-specific formats. To that end, in this paper we propose a model for representing translations as linked data, as an extension of the lemon model. Our translation module represents core information associated with term translations and does not commit to specific views or translation theories. As a proof of concept, we have extracted the translations of the terms contained in Terminesp, a multilingual terminological database, and represented them as linked data. We have made them accessible on the Web both for humans (via a Web interface) and for software agents (via a SPARQL endpoint).
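    A minimal sketch of what such a representation could look like with rdflib is shown below. The lemon namespace is the model's published one, but the trans: namespace and its property names are placeholders standing in for the proposed translation module, and the lexical entries are invented.

```python
# Sketch: two lemon lexical entries linked by a reified translation.
from rdflib import Graph, Literal, Namespace, BNode
from rdflib.namespace import RDF

LEMON = Namespace("http://lemon-model.net/lemon#")
TRANS = Namespace("http://example.org/trans#")    # placeholder module namespace
LEX = Namespace("http://example.org/terminesp/")  # invented dataset namespace

g = Graph()
g.bind("lemon", LEMON)
g.bind("trans", TRANS)

def lexical_entry(g, uri, written_rep, lang):
    """Minimal lemon lexical entry with one canonical form."""
    form = BNode()
    g.add((uri, RDF.type, LEMON.LexicalEntry))
    g.add((uri, LEMON.canonicalForm, form))
    g.add((form, LEMON.writtenRep, Literal(written_rep, lang=lang)))

lexical_entry(g, LEX["coche-es"], "coche", "es")
lexical_entry(g, LEX["car-en"], "car", "en")

# The module would normally relate lemon:LexicalSense nodes;
# we link the entries directly here for brevity.
t = BNode()
g.add((t, RDF.type, TRANS.Translation))
g.add((t, TRANS.translationSource, LEX["coche-es"]))
g.add((t, TRANS.translationTarget, LEX["car-en"]))

print(g.serialize(format="turtle"))
```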

    Representing Translations on the Semantic Web

    Full text link
    The increase in ontologies and data sets published on the Web in languages other than English raises issues related to the representation of linguistic (multilingual) information in ontologies. Such linguistic descriptions can contribute to establishing links between ontologies and data sets described in multiple natural languages in the Linked Open Data cloud. For these reasons, several models have recently been proposed to enable richer linguistic descriptions in ontologies. Among them we find lemon, an RDF ontology-lexicon model that defines specific modules for different types of linguistic descriptions. In this contribution we propose a new module to represent translation relations between lexicons in different natural languages that are associated with the same ontology or belong to different ontologies. This module can represent different types of translation relations, as well as translation metadata such as provenance or the reliability score of translations.
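    Under the same placeholder assumptions as the previous sketch, the fragment below shows the point of reifying a translation as a resource of its own: metadata such as provenance and a reliability score can then be attached to it. Property names under the trans: namespace are again illustrative.

```python
# Sketch: annotating a reified translation with provenance and a reliability
# score (placeholder property names under an invented trans: namespace).
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

TRANS = Namespace("http://example.org/trans#")
LEX = Namespace("http://example.org/terminesp/")

g = Graph()
g.bind("trans", TRANS)

t = URIRef("http://example.org/trans/coche-car")
g.add((t, RDF.type, TRANS.Translation))
g.add((t, TRANS.translationSource, LEX["coche-es"]))
g.add((t, TRANS.translationTarget, LEX["car-en"]))
# Because the translation is a first-class resource, metadata attaches cleanly:
g.add((t, TRANS.confidence, Literal(0.95, datatype=XSD.decimal)))
g.add((t, TRANS.provenance, Literal("Terminesp, automatically extracted")))

print(g.serialize(format="turtle"))
```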

    Verbalization of ontologies for their exploitation through language models.

    Get PDF
    A natural language interpreter, or language model (LM), is a neural network following the transformer architecture, which today is the state of the art for many NLP (Natural Language Processing) techniques. The goal of a probabilistic language model is to learn the joint probability of word sequences over a vocabulary. However, this becomes increasingly difficult as the number of words grows, due to the curse of dimensionality. An ontology is a domain-oriented structure that stores information about the properties of, and relations between, a set of concepts and categories. Language models cannot directly process structured knowledge such as the content of an ontology, so we implement a multilingual method that extracts the explicit content of ontologies and produces a description written in natural English. The main objective of this work is to use a corpus of verbalizations of different ontologies as input to a language model capable of finding new equivalence relations between the ontologies involved. Additionally, the method aims to make the knowledge contained in an ontology more accessible and explainable to people who are not ontology experts. Several verbalization methods already available on the Web were used as a source of inspiration, so as not to reinvent the wheel, when implementing a useful, good-quality verbalization method. The results indicate that the corpus of verbalizations was useful for finding new equivalence relations; the improvement in F1-score depends on the language model used, with an average increase of 8.361% for BERT base models and 4.5% for BERT large models. Keywords: ontology, transformers, language model, neural network, natural language processing, OWL, word embedding, BERT, RoBERTa.
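    A minimal sketch of the verbalization step, assuming simple template rules over named-class axioms (a real verbalizer covers far more of OWL, and the thesis's templates are not reproduced here):

```python
# Sketch: turning explicit ontology axioms into English sentences a language
# model can consume. Only named-class subclass/equivalence axioms are handled.
from rdflib import Graph, URIRef
from rdflib.namespace import RDFS, OWL

def label(term):
    """Readable name from a URI fragment, e.g. 'ElectricCar' -> 'electric car'."""
    name = str(term).split("#")[-1].split("/")[-1]
    return "".join(" " + c.lower() if c.isupper() else c for c in name).strip()

def verbalize(g):
    sentences = []
    for s, o in g.subject_objects(RDFS.subClassOf):
        if isinstance(s, URIRef) and isinstance(o, URIRef):
            sentences.append(f"Every {label(s)} is a {label(o)}.")
    for s, o in g.subject_objects(OWL.equivalentClass):
        if isinstance(s, URIRef) and isinstance(o, URIRef):
            sentences.append(f"A {label(s)} is the same as a {label(o)}.")
    return sentences

g = Graph()
g.parse(data="""
@prefix : <http://example.org/onto#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:ElectricCar rdfs:subClassOf :Car .
""", format="turtle")
print(verbalize(g))  # -> ['Every electric car is a car.']
```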

    Language resources and linked data: a practical perspective

    Full text link
    Recently, experts and practitioners in language resources have started recognizing the benefits of the linked data (LD) paradigm for the representation and exploitation of linguistic data on the Web. The adoption of the LD principles is leading to an emerging ecosystem of multilingual open resources that conform to the Linguistic Linked Open Data Cloud, in which datasets of linguistic data are interconnected and represented following common vocabularies, which facilitates linguistic information discovery, integration and access. To contribute to this initiative, this paper summarizes several key aspects of the representation of linguistic information as linked data from a practical perspective. The main goal of this document is to provide the basic ideas and tools for migrating language resources (lexicons, corpora, etc.) to LD on the Web and for developing some useful NLP tasks with them (e.g., word sense disambiguation). This material was the basis of a tutorial given at the EKAW’14 conference, which is also reported in the paper.
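    As a concrete flavor of the "migration" recipe such a tutorial covers, the sketch below republishes rows of a toy legacy lexicon as RDF. The base URI, class name, and data are invented; a real migration would target an established vocabulary such as lemon.

```python
# Sketch: migrating a tabular lexicon to RDF with rdflib (invented data).
import csv, io
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

BASE = Namespace("http://example.org/lexicon/")  # invented base URI

rows = io.StringIO(
    "id,lemma,lang,definition\n"
    "n1,tree,en,a tall plant with a trunk\n"
    "n2,árbol,es,planta de tronco leñoso\n"
)

g = Graph()
for row in csv.DictReader(rows):
    entry = BASE[row["id"]]
    g.add((entry, RDF.type, BASE.LexicalEntry))  # illustrative class
    g.add((entry, RDFS.label, Literal(row["lemma"], lang=row["lang"])))
    g.add((entry, RDFS.comment, Literal(row["definition"], lang=row["lang"])))

print(g.serialize(format="turtle"))
```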