Search CORE

40 research outputs found

Adjusting Sense Representations for Word Sense Disambiguation and Automatic Pun Interpretation

Author: Miller Tristan
Publication venue: tuprints
Publication date: 04/01/2016

Word sense disambiguation (WSD)—the task of determining which meaning a word carries in a particular context—is a core research problem in computational linguistics. Though it has long been recognized that supervised (machine learning–based) approaches to WSD can yield impressive results, they require an amount of manually annotated training data that is often too expensive or impractical to obtain. This is a particular problem for under-resourced languages and domains, and is also a hurdle in well-resourced languages when processing the sort of lexical-semantic anomalies employed for deliberate effect in humour and wordplay. In contrast to supervised systems are knowledge-based techniques, which rely only on pre-existing lexical-semantic resources (LSRs). These techniques are of more general applicability but tend to suffer from lower performance due to the informational gap between the target word's context and the sense descriptions provided by the LSR. This dissertation is concerned with extending the efficacy and applicability of knowledge-based word sense disambiguation. First, we investigate two approaches for bridging the information gap and thereby improving the performance of knowledge-based WSD. In the first approach we supplement the word's context and the LSR's sense descriptions with entries from a distributional thesaurus. The second approach enriches an LSR's sense information by aligning it to other, complementary LSRs. Our next main contribution is to adapt techniques from word sense disambiguation to a novel task: the interpretation of puns. Traditional NLP applications, including WSD, usually treat the source text as carrying a single meaning, and therefore cannot cope with the intentionally ambiguous constructions found in humour and wordplay. 
We describe how algorithms and evaluation methodologies from traditional word sense disambiguation can be adapted for the "disambiguation" of puns, or rather for the identification of their double meanings. Finally, we cover the design and construction of technological and linguistic resources aimed at supporting the research and application of word sense disambiguation. Development and comparison of WSD systems has long been hampered by a lack of standardized data formats, language resources, software components, and workflows. To address this issue, we designed and implemented a modular, extensible framework for WSD. It implements, encapsulates, and aggregates reusable, interoperable components using UIMA, an industry-standard information processing architecture. We have also produced two large sense-annotated data sets for under-resourced languages or domains: one of these targets German-language text, and the other English-language puns.
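The knowledge-based setting described in this abstract can be illustrated with a minimal sketch of overlap-based disambiguation in the style of simplified Lesk. This is not the dissertation's system: the toy sense inventory, the sense labels, the stopword list, and all function names below are assumptions made purely for illustration. The sketch scores each sense by the number of words its description shares with the target word's context, and treats pun interpretation as returning the two best-scoring senses instead of one.

```python
import re

# Hypothetical toy sense inventory; a real lexical-semantic resource
# (LSR) would supply these sense descriptions.
SENSES = {
    "bank": {
        "bank%finance": "an institution that accepts deposits and lends money",
        "bank%river": "the sloping land beside a river or stream",
    }
}

# Assumed minimal stopword list for the example.
STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "to", "in", "on", "by", "at"}


def tokens(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def rank_senses(word, context):
    """Rank the senses of `word` by word overlap between context and gloss."""
    ctx = tokens(context) - {word}
    scored = [(len(ctx & tokens(gloss)), sense) for sense, gloss in SENSES[word].items()]
    # Highest overlap first; ties are broken alphabetically by sense label.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


def disambiguate(word, context):
    """Ordinary WSD: return the single best-scoring sense."""
    return rank_senses(word, context)[0][1]


def interpret_pun(word, context):
    """Pun interpretation sketch: return the two best-scoring senses."""
    return [sense for _, sense in rank_senses(word, context)[:2]]
```

With this toy inventory, `disambiguate("bank", "She sat on the bank of the river")` selects `bank%river`, because only that description shares a word with the context. The small overlap counts also show the informational gap the abstract mentions: "deposit" in a context does not match "deposits" in a description, which is the kind of mismatch that enriching the context or the sense descriptions aims to reduce.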

TUbiblio

tuprints

The Future of Information Sciences : INFuture2009 : Digital Resources and Knowledge Sharing

Author
Publication venue: Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb
Publication date: 01/11/2009

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Acquiring Thesauri from Wikipedia

Author: Novák Ján
Publication venue: Vysoké učení technické v Brně. Fakulta informačních technologií
Publication date: 01/01/2011

This thesis deals with the automatic acquisition of thesauri from Wikipedia. It describes the structure of Wikipedia as a suitable data set for thesaurus construction, along with several methods for computing the semantic similarity of terms that are used in building the thesaurus. It also describes the design and implementation of a system for automatically acquiring thesauri from Wikipedia. Finally, the implemented system is evaluated using standard metrics such as precision and recall.
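The precision and recall evaluation mentioned in this abstract can be sketched as follows. The thesis's actual evaluation setup is not given here, so the representation of a thesaurus as a set of term pairs, the example pairs, and the function name are assumptions for illustration only.

```python
def precision_recall(extracted, gold):
    """Compare extracted related-term pairs against a gold-standard set.

    Both arguments are iterables of hashable items, for example
    (term, related_term) tuples. Returns (precision, recall).
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example data, not taken from the thesis.
extracted_pairs = {("car", "automobile"), ("car", "banana"), ("dog", "hound")}
gold_pairs = {("car", "automobile"), ("dog", "hound"), ("cat", "feline"), ("boat", "ship")}

precision, recall = precision_recall(extracted_pairs, gold_pairs)
```

In this example two of the three extracted pairs are correct and two of the four gold pairs are found, so precision is 2/3 and recall is 1/2.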

Digital library of Brno University of Technology

National Repository of Grey Literature

Metadata for semantic and social applications

Author: Klas Wolfgang
Greenberg Jane
Publication venue: University of Göttingen
Publication date: 22/09/2008

Metabiblioteca-Biblioteca Digital Libros Abiertos

Changing Higher Education Learning with Web 2.0 and Open Education Citation, Annotation, and Thematic Coding Appendices

Author: Fitt M. Harrison
Leary Heather
Wiley David
Publication venue: Hosted by Utah State University Libraries
Publication date: 01/10/2009

Appendices of citations, annotations, and themes for research conducted on four websites: Delicious, Wikipedia, YouTube, and Facebook.

DigitalCommons@USU

Community-driven & Work-integrated Creation, Use and Evolution of Ontological Knowledge Structures

Author: Braun Simone
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2012

KITopen

Ontology mapping with auxiliary resources

Author: Schadd Frederik
Publication venue: University of Maastricht
Publication date: 01/01/2015

Maastricht University Research Portal

The People’s Encyclopedia Under the Gaze of the Sages: A Systematic Review of Scholarly Research on Wikipedia

Author: Lanamäki Arto
Mehdi Mohamad
Mesgari Mostafa
Nielsen Finn Årup
Okoli Chitu
Publication venue: Elsevier BV
Publication date: 01/01/2012

Crossref

Online Research Database In Technology

Natural language processing for semiautomatic semantics extraction: encyclopedic entry disambiguation and relationship extraction using Wikipedia and WordNet

Author: Ruiz-Casado María
Publication venue
Publication date: 01/01/2009

Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, September 200

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

On the Mono- and Cross-Language Detection of Text Re-Use and Plagiarism

Author: Barrón Cedeño Luis Alberto
Publication venue: Universitat Politècnica de València
Publication date: 08/06/2012

Barrón Cedeño, LA. (2012). On the Mono- and Cross-Language Detection of Text Re-Use and Plagiarism [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/16012 Palanci

RiuNet