Search CORE

4,680 research outputs found

Multi-polarity text segmentation using graph theory

Author: Gao Wen
Huang Tiejun
Li Jia
Tian Yonghong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008
Field of study

Text segmentation, or named text binarization, is usually an essential step for text information extraction from images and videos. However, most existing text segmentation methods have difficulties in extracting multi-polarity texts, where multi-polarity texts mean those texts with multiple colors or intensities in the same line. In this paper, we propose a novel algorithm for multi-polarity text segmentation based on graph theory. By representing a text image with an undirected weighted graph and partitioning it iteratively, multi-polarity text image can be effectively split into several single-polarity text images. As a result, these text images are then segmented by single-polarity text segmentation algorithms. Experiments on thousands of multi-polarity text images show that our algorithm can effectively segment multi-polarity texts. ? 2008 IEEE.EI

Crossref

Information extraction from multimedia web documents: an open-source platform and testbed

Author: Blanco Roi
Boato Giulia
Costanzo Andrea
Demidova Elena
Dupplaw David
Fontani Marco
Griffiths Thomas
Hare Jonathon
Johansson Richard
Lewis Paul H.
Matthews Michael
Minack Enrico
Moschitti Alessandro
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2014
Field of study

The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques have been integrated into a single, but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents, and unlike earlier platforms, it exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software in the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval

Southampton (e-Prints Soton)

Method for Aspect-Based Sentiment Annotation Using Rhetorical Analysis

Author: JR Martin
L Danlos
L Page
M Taboada
S Joty
Ł Augustyniak
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/09/2017
Field of study

This paper fills a gap in aspect-based sentiment analysis and aims to present a new method for preparing and analysing texts concerning opinion and generating user-friendly descriptive reports in natural language. We present a comprehensive set of techniques derived from Rhetorical Structure Theory and sentiment analysis to extract aspects from textual opinions and then build an abstractive summary of a set of opinions. Moreover, we propose aspect-aspect graphs to evaluate the importance of aspects and to filter out unimportant ones from the summary. Additionally, the paper presents a prototype solution of data flow with interesting and valuable results. The proposed method's results proved the high accuracy of aspect detection when applied to the gold standard dataset

arXiv.org e-Print Archive

Crossref

Proceedings of the LREC workshop on partial parsing : between chunk parsing and deep parsing

Author: Kübler Sandra
Piskorski Jakub
Przepiorkowski Adam
Publication venue
Publication date: 03/11/2008
Field of study

Hochschulschriftenserver - Universität Frankfurt am Main

Evaluation in Discourse: a Corpus-Based Study

Author: Asher Nicholas
Benamara Farah
Chardon Baptiste
Mathieu Yvette Yannick
Popescu Vladimir
Publication venue: University of Illinois at Chicago Library
Publication date: 01/01/2016
Field of study

This paper describes the CASOAR corpus, the first manually annotated corpus that explores the impact of discourse structure on sentiment analysis with a study of movie reviews in French and in English as well as letters to the editor in French. While annotating opinions at the expression, the sentence or the document level is a well-established task and relatively straightforward, discourse annotation remains difficult, especially for non-experts. Therefore, combining both annotations poses several methodological problems that we address here. We propose a multi-layered annotation scheme that includes: the complete discourse structure according to the Segmented Discourse Representation Theory, the opinion orientation of elementary discourse units and opinion expressions, and their associated features. We detail each layer, explore the interactions between them and discuss our results. In particular, we examine the correlation between discourse and semantic category of opinion expressions, the impact of discourse relations on both subjectivity and polarity analysis and the impact of discourse on the determination of the overall opinion of a document. Our results demonstrate that discourse is an important cue for sentiment analysis, at least for the corpus genres we have studied

University of Illinois at Chicago: Journals@UIC

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Hal-Diderot

Dialogue & Discourse (E-Journal - Universität Bielefeld)

Multiple Instance Learning Networks for Fine-Grained Sentiment Analysis

Author: Angelidis Stefanos
Lapata Maria
Publication venue
Publication date: 01/01/2018
Field of study

We consider the task of fine-grained sentiment analysis from the perspective of multiple instance learning (MIL). Our neural model is trained on document sentiment labels, and learns to predict the sentiment of text segments, i.e. sentences or elementary discourse units (EDUs), without segment-level supervision. We introduce an attention-based polarity scoring method for identifying positive and negative text snippets and a new dataset which we call SPOT (as shorthand for Segment-level POlariTy annotations) for evaluating MIL-style sentiment models like ours. Experimental results demonstrate superior performance against multiple baselines, whereas a judgement elicitation study shows that EDU-level opinion extraction produces more informative summaries than sentence-based alternatives.Comment: Final published version. Please cite using appropriate date (2018). Link to journal: http://www.transacl.org/ojs/index.php/tacl/article/view/1225/27

arXiv.org e-Print Archive

Edinburgh Research Explorer