3 research outputs found

Measuring Semantic Similarity: Representations and Methods

Author: Lintean Mihai Cosmin
Publication venue: University of Memphis Digital Commons
Publication date: 25/07/2011

This dissertation investigates and proposes ways to quantify and measure semantic similarity between texts. The general approach is to rely on linguistic information at various levels, including lexical, lexico-semantic, and syntactic. The approach starts by mapping texts onto structured representations that include lexical, lexico-semantic, and syntactic information. The representation is then used as input to methods designed to measure the semantic similarity between texts based on the available linguistic information. While world knowledge is needed to properly assess the semantic similarity of texts, our approach does not use it, which is a weakness. We limit ourselves to answering the question of how successfully one can measure the semantic similarity of texts using just linguistic information. The lexical information in the original texts is retained by using the words in the corresponding representations of the texts. Syntactic information is encoded using dependency relation trees, which explicitly represent the syntactic relations between words. Word-level semantic information is either relatively encoded through the use of semantic similarity measures like WordNet Similarity or explicitly encoded using vectorial representations such as Latent Semantic Analysis (LSA). Several methods are studied to compare the representations, ranging from simple lexical overlap to more complex methods such as comparing semantic representations in vector spaces as well as syntactic structures. Furthermore, a few powerful kernel models are proposed for use in combination with Support Vector Machine (SVM) classifiers for the case in which the semantic similarity problem is modeled as a classification task.
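As a rough illustration of the two ends of the range of methods named in this abstract, the following minimal sketch computes a simple lexical overlap and a vector-space cosine between two texts. It is not the dissertation's implementation: the function names are hypothetical, whitespace tokenization is an assumption, and plain word-count vectors stand in for LSA vectors, which would require a trained semantic space.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would
    # use a proper tokenizer and possibly lemmatization.
    return text.lower().split()


def lexical_overlap(a, b):
    # Jaccard overlap: shared distinct words divided by all distinct words.
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(a, b):
    # Cosine between word-count vectors (a stand-in for LSA vectors).
    vec_a, vec_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(count * vec_b[word] for word, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Both functions return a score between 0 and 1. The overlap ignores word frequency, while the cosine weights repeated words more heavily.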

Analysis, optimization and development of an answer scoring system

Author: López Gazpio Iñigo
Publication date: 01/01/2014

The main contribution of this work is to analyze and describe the state-of-the-art performance of answer scoring systems from the SemEval-2013 task, as well as to continue the development of an answer scoring system (EHU-ALM) developed at the University of the Basque Country. Overall, this master's thesis focuses on finding configurations that improve results on the SemEval dataset, using attribute engineering techniques to find optimal feature subsets and trying different hierarchical configurations to compare their performance against the traditional one-versus-all approach. Throughout the work we propose two alternative strategies: on the one hand, to improve the EHU-ALM system without changing the architecture, and, on the other hand, to improve the system by adapting it to a hierarchical configuration. To build these new models we describe and use distinct attribute engineering, data preprocessing, and machine learning techniques.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital para la Docencia y la Investigación

Assessing forward-, reverse-, and average-entailer indices on natural language input from the Intelligent Tutoring System, iSTART

Authors: Crossley Scott A.; Graesser Arthur C.; McCarthy Philip M.; McNamara Danielle S.; Rus Vasile
Publication venue: University of Memphis Digital Commons
Publication date: 17/11/2008

This study reports on an experiment that analyzes a variety of entailment evaluations provided by a lexico-syntactic tool, the Entailer. The environment for these analyses is from a corpus of self-explanations taken from the Intelligent Tutoring System, iSTART. The purpose of this study is to examine how evaluations of hand-coded entailment, paraphrase, and elaboration compare to various evaluations provided by the Entailer. The evaluations include standard entailment (forward) as well as the new indices of Reverse- and Average-Entailment. The study finds that the Entailer's indices match or surpass human evaluators in making textual evaluations. The findings have important implications for providing accurate and appropriate feedback to users of Intelligent Tutoring Systems. Copyright © 2008, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
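The relationship between the three indices named in this abstract can be pictured with a toy sketch. It does not reproduce the Entailer, which is a lexico-syntactic tool. The sketch assumes, for illustration only, that a directional score is the fraction of distinct hypothesis words found in the text, that the reverse index swaps the two arguments, and that the average index is the mean of the two directions. All function names are hypothetical.

```python
def directional_score(text, hypothesis):
    # Assumption: purely lexical coverage, i.e. the fraction of distinct
    # hypothesis words that also occur in the text.
    text_words = set(text.lower().split())
    hyp_words = set(hypothesis.lower().split())
    if not hyp_words:
        return 0.0
    return len(hyp_words & text_words) / len(hyp_words)


def entailment_indices(text, hypothesis):
    # Forward: does the text cover the hypothesis?
    forward = directional_score(text, hypothesis)
    # Reverse: the same score with the roles swapped.
    reverse = directional_score(hypothesis, text)
    # Average: assumed here to be the mean of the two directions.
    return {
        "forward": forward,
        "reverse": reverse,
        "average": (forward + reverse) / 2,
    }
```

Under these assumptions, a hypothesis that uses only a subset of the text's words gets a forward score of 1.0 and a lower reverse score, so the average separates near-paraphrases from one-directional coverage.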