USFD at KBP 2011: Entity Linking, Slot Filling and Temporal Bounding
This paper describes the University of Sheffield's entry in the 2011 TAC KBP
entity linking and slot filling tasks. We chose to participate in the
monolingual entity linking task, the monolingual slot filling task, and the
temporal slot filling task. We set out to build a framework for
experimentation with knowledge base population; this framework was created and
applied to multiple KBP tasks. We demonstrated that our proposed framework is
effective and suitable for collaborative development efforts, as well as useful
in a teaching environment. Finally, we present results that, while very modest,
represent an order-of-magnitude improvement over our 2010 attempt. Comment: Proc. Text Analysis Conference (2011).
On the Feasibility of Automated Detection of Allusive Text Reuse
The detection of allusive text reuse is particularly challenging due to the
sparse evidence on which allusive references rely---commonly based on few or
no shared words. Arguably, lexical semantics can be drawn upon, since
uncovering semantic relations between words has the potential to increase the
support underlying the allusion and alleviate the lexical sparsity. A further
obstacle is the lack of evaluation benchmark corpora, largely due to the highly
interpretative character of the annotation process. In the present paper, we
aim to elucidate the feasibility of automated allusion detection. We approach
the matter from an Information Retrieval perspective in which referencing texts
act as queries and referenced texts as relevant documents to be retrieved, and
estimate the difficulty of benchmark corpus compilation by a novel
inter-annotator agreement study on query segmentation. Furthermore, we
investigate to what extent the integration of lexical semantic information
derived from distributional models and ontologies can aid retrieving cases of
allusive reuse. The results show that (i) despite low agreement scores, using
manual queries considerably improves retrieval performance with respect to a
windowing approach, and that (ii) retrieval performance can be moderately
boosted with distributional semantics.
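The IR framing described in this abstract---referencing passages as queries, referenced passages as documents, with distributional semantics bridging the lexical gap---can be sketched roughly as follows. This is not the paper's system: the tiny embedding table, the example texts, and the mean-vector ranking are invented here purely for illustration.

```python
# Toy sketch of allusion retrieval as an IR task: rank candidate
# "referenced" documents by cosine similarity between averaged word
# vectors of query and document. The embedding table below is a made-up
# illustration, not real distributional data.
import math

EMB = {
    "wrath": [1.0, 0.1], "anger": [0.9, 0.1], "sing": [0.1, 1.0],
    "goddess": [0.2, 0.8], "muse": [0.1, 0.9], "ship": [0.1, 0.6],
}

def embed(text):
    # Average the vectors of in-vocabulary tokens; zero vector if none.
    vecs = [EMB[w] for w in text.lower().split() if w in EMB]
    if not vecs:
        return [0.0, 0.0]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, documents):
    # Documents sorted by similarity to the query, best first.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)

docs = ["sing goddess the wrath", "the ship sailed on"]
best = rank("anger of achilles", docs)[0]
```

Note how "anger" retrieves the "wrath" passage despite zero lexical overlap with it---the kind of support a distributional model can add when an allusion shares few or no surface words with its source.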
Recognizing and organizing opinions expressed in the world press
Tomorrow's question answering systems will need the ability to process information about beliefs, opinions, and evaluations---the perspective of an agent. Answers to many simple factual questions---even yes/no questions---are affected by the perspective of the information source. For example, a questioner asking question (1) might be interested to know that, in general, sources in European and North American governments tend to answer "no" to question (1), while sources in African governments tend to answer "yes."
Annotation Studio: multimedia text annotation for students
Annotation Studio will be a web-based application that actively engages students in interpreting literary texts and other humanities documents. While strengthening students' new media literacies, this open source web application will develop traditional humanistic skills including close reading, textual analysis, persuasive writing, and critical thinking. Initial features will include: 1) easy-to-use annotation tools that facilitate linking and comparing primary texts with multimedia source, variation, and adaptation documents; 2) sharable collections of multimedia materials prepared by faculty and student users; 3) multiple filtering and display mechanisms for texts, written annotations, and multimedia annotations; 4) collaboration functionality; and 5) multimedia composition tools. Products of the start-up phase will include a working prototype, feedback from students and instructors, and a white paper summarizing lessons learned.
Using term clouds to represent segment-level semantic content of podcasts
Spoken audio, like any time-continuous medium, is notoriously difficult to browse or skim without the support of an interface providing semantically annotated jump points that signal the user where to listen in. Creation of time-aligned metadata by human annotators is prohibitively expensive, motivating the investigation of representations of segment-level semantic content based on transcripts generated by automatic speech recognition (ASR). This paper examines the feasibility of using term clouds to provide users with a structured representation of the semantic content of podcast episodes. Podcast episodes are visualized as a series of sub-episode segments, each represented by a term cloud derived from an ASR-generated transcript. The quality of segment-level term clouds is measured quantitatively, and their utility is investigated in a small-scale user study based on human-labeled segment boundaries. Since the segment-level clouds generated from ASR transcripts prove useful, we examine an adaptation of text tiling techniques to speech in order to generate segments as part of a completely automated indexing and structuring system for browsing spoken audio. Results demonstrate that the generated segments are comparable with human-selected segment boundaries.