103 research outputs found

Detecting Event-Related Links and Sentiments from Social Media Texts

Author: BALAHUR DOBRESCU ALEXANDRA
TANEV Hristo
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 22/05/2013

Nowadays, the importance of Social Media is constantly growing, as people often use such platforms to share mainstream media news and comment on the events that they relate to. As such, people no longer remain mere spectators to the events that happen in the world, but become part of them, commenting on their developments and the entities involved, sharing their opinions and distributing related content. This paper describes a system that links the main events detected from clusters of newspaper articles to tweets related to them, detects complementary information sources from the links they contain and subsequently applies sentiment analysis to classify them as positive, negative or neutral. In this manner, readers can follow the main events happening in the world from the perspective of both mainstream and social media, together with the public's perception of them. This system is part of a media monitoring framework working live and it will be demonstrated using Google Earth. JRC.G.2-Global security and crisis management

JRC Publications Repository

A combined qualitative-quantitative approach for the identification of highly co-creative technology-driven firms

Author: Milyakov Hristo
Ruskov Petko
Tanev Stoyan
Publication date: 01/01/2010

Crossref

Syddansk Universitets Forskerportal

JRC's Participation in the Guided Summarization Task at TAC 2010

Author: KABADJOV MIJAIL
STEINBERGER JOSEF
STEINBERGER Ralf
TANEV Hristo
Publication venue: National Institute of Standards and Technology (NIST)
Publication date: 24/01/2011

In this paper we describe our participation in the Guided Summarization Task at the Text Analysis Conference 2010 (TAC'10). The goal of the task was to encourage a deeper semantic analysis of the source documents instead of relying only on document word frequencies to select important concepts. We used the output of our event extraction system and automatic learning of semantically-related terms to capture the required aspects of each particular article category. We submitted two runs: the first uses information extraction tools in combination with co-occurrence of features, the second uses only co-occurrence information. In the following sections we describe our runs and discuss the results attained. JRC.DG.G.2-Global security and crisis management

JRC Publications Repository

Acronym recognition and processing in 22 languages

Author: della Rocca Leonida
Ehrmann Maud
Steinberger Ralf
Tanev Hristo
Publication date: 24/09/2013

We present work on recognising acronyms of the form Long-Form (Short-Form), such as "International Monetary Fund (IMF)", in millions of news articles in twenty-two languages, as part of our more general effort to recognise entities and their variants in news text and to use them for the automatic analysis of the news, including the linking of related news across languages. We show how the acronym recognition patterns, initially developed for medical terms, needed to be adapted to the more general news domain and we present evaluation results. We describe our effort to automatically merge the numerous long-form variants referring to the same short-form, while keeping non-related long-forms separate. Finally, we provide extensive statistics on the frequency and the distribution of short-form/long-form pairs across languages.

arXiv.org e-Print Archive

CiteSeerX

Extracting and Learning Social Networks out of Multilingual News

Author: ATKINSON Martin
POULIQUEN Bruno
TANEV Hristo
Publication venue: SONET (SOcial NETworks)
Publication date: 21/11/2008

Various kinds of social networks can be derived from the analysis of news articles. We present here our experience in building social networks by extracting relationships between entities, all automatically derived from multilingual news articles. Unqualified relationships between persons can be extracted through simple co-occurrence statistics. Qualified relationships can be extracted using linguistic patterns. Our highly redundant sources (50,000 daily articles in 40 languages) are used both to validate our algorithms and to strengthen pertinent relationships. Due to the amount of data we process, these social networks pose a complex challenge for their useful visualization and navigation. JRC.G.2-Support to external security

JRC Publications Repository

Automated Extraction of Socio-political Events from News (AESPEN): Workshop and Shared Task Report

Author: Hürriyetoğlu Ali
Zavarella Vanni
Tanev Hristo
Yörük Erdem
Safaya Ali
Mutlu Osman
Publication date: 12/05/2020

We describe our effort on automated extraction of socio-political events from news in the scope of a workshop and a shared task we organized at the Language Resources and Evaluation Conference (LREC 2020). We believe the event extraction studies in computational linguistics and in the social and political sciences should further support each other in order to enable large-scale socio-political event information collection across sources, countries, and languages. The event consists of a regular research paper track and a shared task track, the latter on event sentence coreference identification (ESCI). All submissions were reviewed by five members of the program committee. The workshop attracted research papers related to evaluation of machine learning methodologies, language resources, material conflict forecasting, and a shared task participation report in the scope of socio-political event information collection. It has shown us the volume and variety of both the data sources and event information collection approaches related to socio-political events and the need to fill the gap between automated text processing techniques and requirements of social and political sciences.

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Servicio de Difusión de la Creación Intelectual

Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2022): Workshop and Shared Task Report

Author: Hürriyetoğlu Ali
Mutlu Osman
Tanev Hristo
Yeniterzi Reyyan
Yörük Erdem
Zavarella Vanni
Publication date: 21/11/2022

We provide a summary of the fifth edition of the CASE workshop, held in the scope of EMNLP 2022. The workshop consists of regular papers, two keynotes, working papers of shared task participants, and task overview papers. This workshop has been bringing together all aspects of event information collection across technical and social science fields. In addition to the progress in depth, the submission and acceptance of multimodal approaches show the widening of this interdisciplinary research topic. Comment: to appear at CASE 2022 @ EMNLP 2022

arXiv.org e-Print Archive

Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media

Author: Becker Hila
Flora
Ji Heng
Khandpur Rupinder P.
Lee Wenke
Li Frank
Liu Yang
Modi A.
Muthiah Sathappan
Ovelgonne Michael
Rehurek Radim
Sabottke Carl
Soska Kyle
Tanev Hristo
Weller-Fahy David J.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/02/2017

Social media is often viewed as a sensor into various societal events such as disease outbreaks, protests, and elections. We describe the use of social media as a crowdsourced sensor to gain insight into ongoing cyber-attacks. Our approach detects a broad range of cyber-attacks (e.g., distributed denial of service (DDOS) attacks, data breaches, and account hijacking) in an unsupervised manner using just a limited fixed set of seed event triggers. A new query expansion strategy based on convolutional kernels and dependency parses helps model reporting structure and aids in identifying key event characteristics. Through a large-scale analysis over Twitter, we demonstrate that our approach consistently identifies and encodes events, outperforming existing methods. Comment: 13 single column pages, 5 figures, submitted to KDD 201

arXiv.org e-Print Archive

Crossref