23 research outputs found

Exploring the automatic selection of basic level concepts

Author: Izquierdo Beviá Rubén
Rigau Claramunt German
Suárez Cueto Armando
Publication venue: INCOMA
Publication date: 01/01/2007

We present a very simple method for selecting Base Level Concepts using basic structural properties of WordNet. We also empirically demonstrate that this automatically derived set of Base Level Concepts groups senses into an adequate level of abstraction for class-based Word Sense Disambiguation. In fact, a very naive Most Frequent classifier using the selected classes achieves semantic tagging accuracy above 75%. This work was supported by the European Union under the QALL-ME project (FP6 IST-033860) and by the Spanish Government under the Text-Mess (TIN2006-15265-C06-01) and KNOW (TIN2006-15049-C03-01) projects.
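The "Most Frequent classifier" baseline mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a training corpus given as (word, class) pairs, and the function names and toy class labels are invented for illustration.

```python
from collections import Counter, defaultdict

def train_most_frequent_class(tagged_tokens):
    """Learn, for each word, the semantic class it was annotated with most often.

    tagged_tokens: iterable of (word, semantic_class) pairs (assumed input format).
    """
    counts = defaultdict(Counter)
    for word, sem_class in tagged_tokens:
        counts[word][sem_class] += 1
    # Keep only the single most frequent class per word.
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def tag(words, model, default=None):
    """Assign each word its most frequent class, or `default` if the word is unseen."""
    return [model.get(w, default) for w in words]

def accuracy(predicted, gold):
    """Fraction of positions where the predicted class equals the gold class."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold) if gold else 0.0

# Toy data: the class labels below are placeholders, not real Base Level Concepts.
train = [
    ("bank", "institution"),
    ("bank", "institution"),
    ("bank", "slope"),
    ("plant", "organism"),
]
model = train_most_frequent_class(train)
predictions = tag(["bank", "plant", "river"], model)
```

Because the baseline ignores context entirely, any accuracy it reaches reflects how skewed the class distribution is for each word, which is why coarser classes tend to make it a stronger baseline.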

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Word vs. Class-Based Word Sense Disambiguation

Author: Izquierdo Beviá Rubén
Rigau Claramunt German
Suárez Cueto Armando
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2015

As empirically demonstrated by the Word Sense Disambiguation (WSD) tasks of the last SensEval/SemEval exercises, assigning the appropriate meaning to words in context has resisted all attempts to address it successfully. Many authors argue that one possible reason could be the use of inappropriate sets of word meanings. In particular, WordNet has been used as a de facto standard repository of word meanings in most of these tasks. Thus, instead of using the word senses defined in WordNet, some approaches have derived semantic classes representing groups of word senses. However, the meanings represented by WordNet have only been used for WSD at a very fine-grained sense level or at a very coarse-grained semantic class level (also called SuperSenses). We suspect that an appropriate level of abstraction could lie between these two levels. The contributions of this paper are manifold. First, we propose a simple method to automatically derive semantic classes at intermediate levels of abstraction covering all nominal and verbal WordNet meanings. Second, we empirically demonstrate that our automatically derived semantic classes outperform classical approaches based on word senses and more coarse-grained sense groupings. Third, we demonstrate that our supervised WSD system benefits from using these new semantic classes as additional semantic features while reducing the number of training examples needed. Finally, we demonstrate the robustness of our supervised semantic class-based WSD system when tested on an out-of-domain corpus. This work has been partially supported by the NewsReader project (ICT-2011-316404), the Spanish project SKaTer (TIN2012-38584-C06-02)

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Una aproximación a la desambiguación del sentido de las palabras basada en clases semánticas y aprendizaje automático

Author: Izquierdo Beviá Rubén
Publication venue: 'Universidad de Alicante Servicio de Publicaciones'
Publication date: 01/01/2010

Ph.D. thesis in Computer Science, specifically in the field of Computational Linguistics, written by Rubén Izquierdo at the University of Alicante (UA) under the supervision of Dr. Armando Suárez Cueto (UA) and Dr. German Rigau Claramunt (EHU/UPV). The thesis was defended in Alicante on 17 September 2010 before a panel formed by Dr. Manuel Palomar (UA), Dr. Paloma Moreda (UA), Dr. María Teresa Martín (UJA), Dr. Lluís Padró (UPC) and Dr. Irene Castellón (UB). The grade obtained was Sobresaliente Cum Laude, awarded unanimously. This work was co-funded by the Spanish Ministry of Science and Innovation (project TIN2009-13391-C04-01) and by the Conselleria de Educación of the Generalitat Valenciana (projects PROMETEO/2009/119 and ACOMP/2011/001).

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

A proposal of automatic selection of coarse-grained semantic classes for WSD

Author: Izquierdo Beviá Rubén
Rigau Claramunt German
Suárez Cueto Armando
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2007

We present a very simple method for selecting Base Level Concepts using some basic structural properties of WordNet. We also empirically demonstrate that this automatically derived set of Base Level Concepts groups senses into an adequate level of abstraction for class-based Word Sense Disambiguation. In fact, a very naive Most Frequent classifier using the selected classes achieves semantic tagging accuracy above 75%. This paper has been supported by the European Union under the QALL-ME project (FP6 IST-033860) and by the Spanish Government under the Text-Mess (TIN2006-15265-C06-01) and KNOW (TIN2006-15049-C03-01) projects.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

A probabilistic, text and knowledge-based image retrieval system

Author: Izquierdo Beviá Rubén
Saiz Noeda Maximiliano
Tomás David
Vicedo Jose-Luis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

This paper describes the development of an image retrieval system that combines probabilistic and ontological information. The process is divided into two stages: indexing and retrieval. Three information streams have been created, each carrying a different kind of information: word forms, stems and stemmed bigrams. The final result combines the results obtained from the three streams. Knowledge is added to the system by means of an ontology created automatically from the St. Andrews Corpus. The system has been evaluated in the CLEF05 image retrieval task. This work has been partially supported by the Spanish Government (CICYT) with grant TIC2003-07158-c04-01.

Repositorio Institucional de la Universidad de Alicante

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Modelado de Categorías y Desambiguación del Sentido de las Palabras en el corpus Ancora

Author: Izquierdo Beviá Rubén
Postma Marten
Vossen Piek
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2015

In this paper we present an approach to Word Sense Disambiguation based on Topic Modeling (LDA). Our approach consists of two steps: first, a binary classifier decides whether the most frequent sense applies; then, another classifier handles the cases where the most frequent sense does not apply. An exhaustive evaluation is performed on the Spanish corpus Ancora to analyze the performance of our two-step system and the impact of the context and the different parameters in the system. Our best experiment reaches an accuracy of 74.53, which is 6 points over the highest baseline. All the software developed for these experiments has been made freely available, to enable reproducibility and allow reuse of the software.
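The two-step decision described in this abstract can be sketched as plain control flow. This is only an illustration under stated assumptions: the two classifiers are passed in as callables, the candidate senses are assumed to be non-empty and ordered from most to least frequent, and every name below (including the toy classifiers and sense labels) is hypothetical. The paper's actual LDA-based classifiers are not reproduced here.

```python
from typing import Callable, Sequence

def two_step_disambiguate(
    context: Sequence[str],
    senses: Sequence[str],
    is_mfs: Callable[[Sequence[str]], bool],
    pick_other: Callable[[Sequence[str], Sequence[str]], str],
) -> str:
    """Return a sense for a target word given its context.

    senses: candidate senses, most frequent first (assumed ordering).
    is_mfs: step 1, a binary decision on whether the most frequent sense applies.
    pick_other: step 2, chooses among the remaining senses.
    """
    # Step 1: accept the most frequent sense if the binary classifier says so
    # (or if there is nothing else to choose from).
    if len(senses) == 1 or is_mfs(context):
        return senses[0]
    # Step 2: defer to the second classifier, restricted to the other senses.
    return pick_other(context, senses[1:])

# Toy stand-ins for the two classifiers (placeholders, not trained models).
def toy_is_mfs(context):
    return "money" in context

def toy_pick_other(context, remaining):
    return remaining[0]

senses = ["bank_institution", "bank_slope", "bank_row"]
first = two_step_disambiguate(["deposit", "money"], senses, toy_is_mfs, toy_pick_other)
second = two_step_disambiguate(["river", "mud"], senses, toy_is_mfs, toy_pick_other)
```

Separating the two decisions means the second classifier never has to compete with the dominant sense, which is the motivation the abstract gives for the split.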

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

An user-centred ontology- and entailment-based Question Answering system

Author: Ferrández Escámez Sergio
Ferrández Escámez Óscar
Izquierdo Beviá Rubén
Vicedo Jose-Luis
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2008

This paper presents a user-centred, ontology- and entailment-based Question Answering system. A methodology is proposed for building a user knowledge database, which bridges the gap between natural language questions and formal representations such as database queries. The core of the system relies on a textual entailment engine capable of detecting entailment relations between questions and the knowledge database. The system has been developed for Spanish in the cinema domain, with promising results for use in real environments. This research has been partially funded by the QALL-ME project, under the Sixth Framework Programme of the European Union (reference FP6-IST-033860), and by the Spanish Government CICyT project number TIN2006-15265-C06-01.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

X-Not@rial : sistema de recuperación y extracción de información notarial

Author: Calle Bellido María del Carmen
Izquierdo Beviá Rubén
Llopis Fernando
Muñoz Rafael
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2003

The X-Not@rial system performs information retrieval and information extraction tasks. Information extraction is carried out in the notarial domain, specifically on deeds of sale. The system selects the documents related to deeds of sale from a heterogeneous text collection and then applies information extraction techniques to identify the relevant information.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Spanish all-words semantic class disambiguation using Cast3LB corpus

Author: Izquierdo Beviá Rubén
Moreno Monteagudo Lorenza
Navarro Colorado Borja
Suárez Cueto Armando
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

In this paper, an approach to semantic disambiguation based on machine learning and semantic classes for Spanish is presented. A critical issue in a corpus-based approach to Word Sense Disambiguation (WSD) is the lack of wide-coverage resources from which to learn linguistic information automatically. In particular, all-words sense-annotated corpora such as SemCor do not have enough examples for many senses when used in a machine learning method. Using semantic classes instead of senses makes it possible to collect a larger number of examples for each class while reducing polysemy, which improves the accuracy of semantic disambiguation. Cast3LB, a SemCor-like corpus manually annotated with Spanish WordNet 1.5 senses, has been used in this paper to perform semantic disambiguation based on several sets of classes: lexicographer files of WordNet, WordNet Domains, and the SUMO ontology. This paper has been supported by the Spanish Government under projects CESS-ECE (HUM2004-21127-E) and R2D2 (TIC2003-07158-C04-01).

Repositorio Institucional de la Universidad de Alicante

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Influencia de los estilos de aprendizaje en el uso de redes sociales para docencia [póster]

Author: Garrigós Irene
Izquierdo Beviá Rubén
Mazón Jose-Norberto
Saquete Boró Estela
Vázquez Sonia
Publication date: 01/06/2011

Poster presented at the IX Jornadas de Redes de Investigación en Docencia Universitaria, Alicante, 16-17 June 2011. To analyse students' learning styles. To study how students interact when using social networks. To determine how learning style influences the success of collaborative tasks.

Repositorio Institucional de la Universidad de Alicante