Quootstrap: Scalable Unsupervised Extraction of Quotation-Speaker Pairs from Large News Corpora via Bootstrapping
We propose Quootstrap, a method for extracting quotations, as well as the
names of the speakers who uttered them, from large news corpora. Whereas prior
work has addressed this problem primarily with supervised machine learning, our
approach follows a fully unsupervised bootstrapping paradigm. It leverages the
redundancy present in large news corpora, more precisely, the fact that the
same quotation often appears across multiple news articles in slightly
different contexts. Starting from a few seed patterns, such as ["Q", said S.],
our method extracts a set of quotation-speaker pairs (Q, S), which are in turn
used for discovering new patterns expressing the same quotations; the process
is then repeated with the larger pattern set. Our algorithm is highly scalable,
which we demonstrate by running it on the large ICWSM 2011 Spinn3r corpus.
Validating our results against a crowdsourced ground truth, we obtain 90%
precision at 40% recall using a single seed pattern, with significantly higher
recall values for more frequently reported (and thus likely more interesting)
quotations. Finally, we showcase the usefulness of our algorithm's output for
computational social science by analyzing the sentiment expressed in our
extracted quotations.
Comment: Accepted at the 12th International Conference on Web and Social Media
(ICWSM), 201
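The bootstrapping loop described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the `{Q}`/`{S}` template syntax, the capitalized-words heuristic for speaker names, the whole-sentence matching, and the toy sentences are all assumptions made for illustration, and the sketch omits the scalability and validation aspects the abstract mentions.

```python
import re

# Assumption: a quotation is text between straight double quotes.
QUOTE = r'"(?P<Q>[^"]+)"'
# Assumption: a speaker is two or more capitalized words (toy heuristic).
SPEAKER = r'(?P<S>[A-Z]\w+(?: [A-Z]\w+)+)'


def compile_pattern(pattern):
    """Turn a template such as '{Q}, said {S}.' into a compiled regex."""
    parts = re.split(r'(\{Q\}|\{S\})', pattern)
    regex = ''.join(
        QUOTE if p == '{Q}' else SPEAKER if p == '{S}' else re.escape(p)
        for p in parts
    )
    return re.compile(regex)


def extract_pairs(patterns, sentences):
    """Apply every pattern to every sentence and collect (Q, S) pairs."""
    pairs = set()
    for pattern in patterns:
        rx = compile_pattern(pattern)
        for sentence in sentences:
            m = rx.fullmatch(sentence)
            if m:
                pairs.add((m.group('Q'), m.group('S')))
    return pairs


def induce_patterns(pairs, sentences):
    """Find sentences containing a known pair and abstract them into templates."""
    found = set()
    for quote, speaker in pairs:
        quoted = f'"{quote}"'
        for sentence in sentences:
            if quoted in sentence and speaker in sentence:
                found.add(sentence.replace(quoted, '{Q}').replace(speaker, '{S}'))
    return found


def bootstrap(seed_patterns, sentences, max_iters=5):
    """Alternate extraction and pattern discovery until no new pattern appears."""
    patterns = set(seed_patterns)
    for _ in range(max_iters):
        pairs = extract_pairs(patterns, sentences)
        discovered = induce_patterns(pairs, sentences) - patterns
        if not discovered:
            break
        patterns |= discovered
    return extract_pairs(patterns, sentences), patterns


# Hypothetical toy corpus: the first quotation appears in two contexts.
SENTENCES = [
    '"We will rebuild the bridge", said Jane Doe.',
    'Jane Doe told reporters: "We will rebuild the bridge".',
    'John Roe told reporters: "Taxes are too high".',
]

pairs, patterns = bootstrap({'{Q}, said {S}.'}, SENTENCES)
print(sorted(pairs))
print(sorted(patterns))
```

In this toy corpus the single seed pattern only matches the first sentence, but because the same quotation recurs in the second sentence, a second template is learned, and that template in turn recovers the third sentence's quotation. This is the redundancy effect the abstract relies on.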
KACST Arabic Text Classification Project: Overview and Preliminary Results
Electronically formatted Arabic free text is now abundant on the World Wide Web, often linked to commercial enterprises and/or government organizations. A great deal of knowledge and many relations lie hidden within these texts, and they can be exploited once suitable intelligent tools have been identified and applied. For example, text mining can support text classification and categorization. Text classification aims to automatically assign a text to a predefined category based on identifiable linguistic features. It has many useful applications including, but not restricted to, email spam detection, web page content filtering, and automatic message routing. This paper gives an overview of the King Abdulaziz City for Science and Technology (KACST) Arabic Text Classification Project, along with some preliminary results. The project will contribute to a better understanding and elaboration of Arabic text classification techniques.
Evaluation in media texts: a cross-cultural linguistic investigation
A quantitative/interpretative approach to the comparative linguistic analysis of media texts is proposed and applied to a contrastive analysis of texts from the English-language China Daily and the UK Times to look for evidence of differences in what Labov calls "evaluation." These differences are then correlated to differences in the roles played by the media in Britain and China in their respective societies.
The aim is to demonstrate that, despite reservations related to the Chinese texts not being written in the journalists' native language, a direct linguistic comparison of British media texts with Chinese media texts written in English can yield valuable insights into the workings of the Chinese media that supplement nonlinguistic studies.
Econometrics meets sentiment: an overview of methodology and applications
The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software.
The role of metaphor in shaping the identity and agenda of the United Nations: the imagining of an international community and international threat
This article examines the representation of the United Nations in speeches delivered by its Secretary General. It focuses on the role of metaphor in constructing a common "imagining" of international diplomacy and legitimising an international organisational identity. The SG legitimises the organisation, in part, through the delegitimisation of agents/actions/events constructed as threatening to the international community and to the well-being of mankind. It is a desire to combat the forces of menace or evil that is argued to motivate and determine the organisational agenda. This is predicated upon an international ideology of humanity in which difference is silenced and "working towards the common good" is emphasised. This is exploited to rouse emotions and legitimise institutional power. Polarisation and antithesis are achieved through the employment of metaphors designed to enhance positive and negative evaluations. The article further points to the constitutive, persuasive and edifying power of topic- and situationally motivated metaphors in speech-making.
Ideology of objectivity in political journalism. Attitudes, values and beliefs around truth as a possible horizon?
From a critical-discursive approach, this article analyzes automatic and reflexive contents around "objectivity" as a stylistic-normative code and cultural device with mythical contours, shared by journalists and audiences of political information. Based on interviews with professionals from different mass media outlets in Córdoba, Argentina, conducted under an ethnographic approach between 2012 and 2014, it first discusses the self-perception of their contemporary role and the conditions of their daily relationship with sources and events. Given the inter-subjective nature of the phenomenon, it then contrasts the journalistic perspectives with the perceptions of local audiences, gathered in simultaneous experimental sessions. Through an analytical triangulation strategy, a significant circular link between professional definitions and consumption expectations is observed.
Affiliation: Paz Garcia, Ana Pamela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba; Argentina. Instituto de Investigaciones Psicológicas (IIPsi), CONICET - Facultad de Psicología, Universidad Nacional de Córdoba; Argentin