Extracting and Visualizing Quotations from News Wires

Author: Denis Pascal
Mignot Victor
Recourcé Gaëlle
Sagot Benoît
Stern Rosa
Villemonte de La Clergerie Éric
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

We introduce SAPIENS, a platform for extracting quotations from news wires, together with their author and context. The originality of SAPIENS is that it relies on a deep linguistic processing chain, which allows quotations to be extracted with wide coverage and under an extended definition, including quotations that are only partially quote-delimited verbatim transcripts. We describe the architecture of SAPIENS and how it was applied to process a corpus of French news wires from the AFP news agency.

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

Convertir des dérivations TAG en dépendances [Converting TAG derivations into dependencies]

Author: Villemonte de La Clergerie Éric
Publication venue: HAL CCSD
Publication date: 01/01/2010

Syntactic dependency structures are important and well suited as a starting point for various applications. In the context of the FRMG TAG parser, we present the details of a process for converting shared derivation forests into shared dependency forests. Some information is also provided on a disambiguation algorithm that operates on these dependency forests.

INRIA a CCSD electronic archive server

Hal-Diderot

Extracting and Attributing Quotes in Text and Assessing them as Opinions

Author: O'Keefe Timothy William
Publication venue: Faculty of Engineering and Information Technologies, School of Information Technologies
Publication date: 01/01/2014

News articles often report on the opinions that salient people have about important issues. While it is possible to infer an opinion from a person's actions, it is much more common to demonstrate that a person holds an opinion by reporting on what they have said. These instances of speech are called reported speech, and in this thesis we set out to detect instances of reported speech, attribute them to their speaker, and identify which instances provide evidence of an opinion. We first focus on extracting reported speech, which involves finding all acts of communication that are reported in an article. Previous work has approached this task with rule-based methods; however, several factors confound these approaches. To demonstrate this, we build a corpus of 965 news articles, where we mark all instances of speech. We then show that a supervised token-based approach outperforms all of our rule-based alternatives, even in extracting direct quotes. Next, we examine the problem of finding the speaker of each quote. For this task we annotate the same 965 news articles with links from each quote to its speaker. Using this, and three other corpora, we develop new methods and features for quote attribution, which achieve state-of-the-art accuracy on our corpus and strong results on the others. Having extracted quotes and determined who spoke them, we move on to the opinion mining part of our work. Most of the task definitions in opinion mining do not easily work with opinions in news, so we define a new task, where the aim is to classify whether quotes demonstrate support, neutrality, or opposition to a given position statement. This formulation improved annotator agreement when compared to our earlier annotation schemes. Using this we build an opinion corpus of 700 news documents covering 7 topics. In this thesis we do not attempt this full task, but we do present preliminary results.

Sydney eScholarship

Extracting and Visualizing Quotations from News Wires

Author: B. Sagot
F. Thomasset
S. Lappin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Crossref

Attribution: a computational approach

Author: Pareti Silvia
Publication venue: The University of Edinburgh
Publication date: 26/11/2015

Our society is overwhelmed with an ever-growing amount of information. Effective management of this information requires novel ways to filter and select the most relevant pieces of information. Some of this information can be associated with the source or sources expressing it. Sources and their relation to what they express affect information and whether we perceive it as relevant, biased or truthful. In news texts in particular, it is common practice to report third-party statements and opinions. Recognizing relations of attribution is therefore a necessary step toward detecting statements and opinions of specific sources and selecting and evaluating information on the basis of its source. The automatic identification of Attribution Relations has applications in numerous research areas. Quotation and opinion extraction, discourse and factuality have all partly addressed the annotation and identification of Attribution Relations. However, disjoint efforts have provided a partial and partly inaccurate picture of attribution. Moreover, these research efforts have generated small or incomplete resources, thus limiting the applicability of machine learning approaches. Existing approaches to extracting Attribution Relations have focused on rule-based models, which are limited both in coverage and precision. This thesis presents a computational approach to attribution that recasts attribution extraction as the identification of the attributed text, its source and the lexical cue linking them in a relation. Drawing on preliminary data-driven investigation, I present a comprehensive lexicalised approach to attribution and further refine and test a previously defined annotation scheme. The scheme has been used to create a corpus annotated with Attribution Relations, with the goal of contributing a large and complete resource that can lay the foundations for future attribution studies.
Based on this resource, I developed a system for the automatic extraction of attribution relations that surpasses traditional syntactic pattern-based approaches. The system is a pipeline of classification and sequence labelling models that identify and link each of the components of an attribution relation. The results show concrete opportunities for attribution-based applications.

Edinburgh Research Archive