7 research outputs found

Neural approaches to sequence labeling for information extraction

Author: Bekoulis Ioannis
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2019

An important aspect of artificial intelligence (AI) is interpreting human language expressed in textual (written) form: natural language processing (NLP) matters because textual information is useful for many applications. Yet understanding it (so-called natural language understanding, NLU) is a challenge, given the unstructured form of text, whose meaning is often ambiguous and context-dependent. In this thesis, we introduce solutions to shortcomings of related work in addressing fundamental tasks in natural language processing, such as named entity recognition (i.e., identifying the entities that occur in a sentence) and relation extraction (identifying relations between entities). Starting from a specific problem (namely, identifying the structure of a house from a textual classified ad), we build, step by step, a complete (automated) solution for the aforementioned tasks, based on neural network architectures. Our solutions are generally applicable to different application domains and languages. We also consider the task of identifying relevant occurrences during an event (e.g., a goal during a soccer match) in information streams on Twitter. More specifically, we formulate this problem as the labeling of word sequences (similar to named entity recognition), exploiting the chronological relation between consecutive tweets.

Ghent University Academic Bibliography

Reconstructing the house from the ad: Structured prediction on real estate classifieds

Author: Bekoulis Ioannis
Deleu Johannes
Demeester Thomas
Develder Chris
Publication date: 01/01/2017

Crossref

Ghent University Academic Bibliography

Adversarial training for multi-context joint entity and relation extraction

Author: Bekoulis Ioannis
Deleu Johannes
Demeester Thomas
Develder Chris
Publication date: 01/01/2018

Ghent University Academic Bibliography

Predicting suicide risk from online postings in Reddit: the UGent-IDLab submission to the CLPsych 2019 Shared Task A

Author: Bekoulis Ioannis
Bitew Semere Kiros
Deleu Johannes
Demeester Thomas
Develder Chris
Sterckx Lucas
Zaporojets Klim
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019

This paper describes IDLab’s text classification systems submitted to Task A as part of the CLPsych 2019 shared task. The aim of this shared task was to develop automated systems that predict the degree of suicide risk of people based on their posts on Reddit. Bag-of-words features, emotion features and post-level predictions are used to derive user-level predictions. Linear models and ensembles of these models are used to predict final scores. We find that predicting fine-grained risk levels is much more difficult than flagging potentially at-risk users. Furthermore, we do not find clear added value from building richer ensembles compared to simple baselines, given the available training data and the nature of the prediction task.

Crossref

Ghent University Academic Bibliography

Joint entity recognition and relation extraction as a multi-head selection problem

Author: Bekoulis Ioannis
Deleu Johannes
Demeester Thomas
Develder Chris
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

State-of-the-art models for joint entity recognition and relation extraction strongly rely on external natural language processing (NLP) tools such as POS (part-of-speech) taggers and dependency parsers. Thus, the performance of such joint models depends on the quality of the features obtained from these NLP tools. However, these features are not always accurate for various languages and contexts. In this paper, we propose a joint neural model which performs entity recognition and relation extraction simultaneously, without the need for any manually extracted features or the use of any external tool. Specifically, we model the entity recognition task using a CRF (Conditional Random Fields) layer and the relation extraction task as a multi-head selection problem (i.e., potentially identifying multiple relations for each entity). We present an extensive experimental setup to demonstrate the effectiveness of our method using datasets from various contexts (i.e., news, biomedical, real estate) and languages (i.e., English, Dutch). Our model outperforms the previous neural models that use automatically extracted features, while it performs within a reasonable margin of feature-based neural models, or even beats them.

arXiv.org e-Print Archive

Ghent University Academic Bibliography
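
The multi-head selection formulation named in the abstract above can be illustrated with a small decoding sketch. It assumes, beyond what the abstract states, that a trained model has already produced one raw score per (token, candidate head, relation) triple, and that a triple is kept when the sigmoid of its score exceeds a threshold. The function name `select_heads`, the relation labels, and the score values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Map a raw score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def select_heads(scores, threshold=0.5):
    """Return every (token, head, relation) triple whose sigmoid score
    exceeds the threshold.

    scores[i][j] is a dict mapping a relation label to the raw score that
    token j is a head of token i under that relation. Because each triple
    is decided independently, a token may end up with several heads.
    """
    selected = []
    for i, row in enumerate(scores):
        for j, relation_scores in enumerate(row):
            for relation, raw in relation_scores.items():
                if sigmoid(raw) > threshold:
                    selected.append((i, j, relation))
    return selected


# Hypothetical scores for three entity tokens (illustrative values only).
tokens = ["Ann", "Acme", "Ghent"]
scores = [[{} for _ in tokens] for _ in tokens]
scores[0][1] = {"Works_For": 1.2, "Live_In": -2.0}
scores[0][2] = {"Live_In": 0.8, "Works_For": -3.0}
scores[1][2] = {"Located_In": -0.4}

relations = select_heads(scores)
for i, j, relation in relations:
    print(tokens[i], relation, tokens[j])
```

In this toy input, the token "Ann" is linked to two different heads, which is the situation a single-head (one relation per entity) decoder could not represent.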

Sub-event detection from Twitter streams as a sequence labeling problem

Author: Bekoulis Ioannis
Deleu Johannes
Demeester Thomas
Develder Chris
Publication date: 01/01/2019

This paper introduces improved methods for sub-event detection in social media streams, by applying neural sequence models not only on the level of individual posts, but also directly on the stream level. Current approaches to identify sub-events within a given event, such as a goal during a soccer match, essentially do not exploit the sequential nature of social media streams. We address this shortcoming by framing the sub-event detection problem in social media streams as a sequence labeling task and adopt a neural sequence architecture that explicitly accounts for the chronological order of posts. Specifically, we (i) establish a neural baseline that outperforms a graph-based state-of-the-art method for binary sub-event detection (2.7% micro-F1 improvement), as well as (ii) demonstrate the superiority of a recurrent neural network model on the post-sequence level for labeled sub-events (2.4% bin-level F1 improvement over non-sequential models). Comment: NAACL 201

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography
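
The sequence labeling framing described in the sub-event detection abstract above can be sketched as a data-preparation step: posts are grouped into chronological bins, and each bin receives one label, so the stream becomes a label sequence that a sequence model could be trained on. The abstract mentions bin-level evaluation but not how bins are built; the fixed-width time bins, the `"O"` label for "no sub-event", and the function names `bin_posts` and `label_bins` are assumptions made for this illustration.

```python
def bin_posts(posts, start, bin_seconds):
    """Group (timestamp, text) posts into consecutive fixed-width time bins.

    Assumes every timestamp is >= start. Returns a list of bins in
    chronological order; each bin is a list of post texts.
    """
    if not posts:
        return []
    last = max(timestamp for timestamp, _ in posts)
    n_bins = int((last - start) // bin_seconds) + 1
    bins = [[] for _ in range(n_bins)]
    for timestamp, text in sorted(posts):
        bins[int((timestamp - start) // bin_seconds)].append(text)
    return bins


def label_bins(n_bins, sub_events, start, bin_seconds):
    """Build one label per bin: the sub-event name if an annotated
    sub-event falls inside the bin, otherwise "O" (assumed null label)."""
    labels = ["O"] * n_bins
    for timestamp, name in sub_events:
        index = int((timestamp - start) // bin_seconds)
        if 0 <= index < n_bins:
            labels[index] = name
    return labels


# Hypothetical stream: timestamps are seconds since the start of the match.
posts = [(5, "kick off"), (70, "GOAL!!"), (65, "what a strike"), (130, "yellow card?")]
sub_events = [(66, "goal")]

bins = bin_posts(posts, start=0, bin_seconds=60)
labels = label_bins(len(bins), sub_events, start=0, bin_seconds=60)
print(list(zip(labels, bins)))
```

The resulting pairs keep the bins in time order, which is what lets a stream-level sequence model use neighbouring bins as context instead of classifying each bin in isolation.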

POLICY BRIEF #36: COVID-19 information on social media in Europe: a threat or a blessing?

Author: Bekoulis Ioannis
Deligiannis Nikolaos
Huu Tien Do
Komorowski Marlen
Picone Ike
Publication date: 03/06/2020

Europe has been facing unprecedented challenges from COVID-19. Providing citizens with accurate, timely and frequent information about the health risks posed by COVID-19, as well as measures they can take to protect themselves, is key to mitigating the spread of the virus. Social media can, in this context, become an essential tool as it allows us to reach and engage a large number of people. But social media can also be misused to spread misinformation. This policy brief critically questions what potential role social media play in informing Europe’s population about the COVID-19 pandemic. Using more than 50,000 tweets, we analyse who has an impact on the discussions surrounding COVID-19 on social media in Europe. With this study, we seek to contribute to the debate on the problems, but also the benefits, of social media for the spread of information.

Online Research @ Cardiff