Italian Event Detection Goes Deep Learning

Author: Caselli Tommaso
Publication date: 01/01/2018

This paper reports on a set of experiments with different word embeddings to initialize a state-of-the-art Bi-LSTM-CRF network for event detection and classification in Italian, following the EVENTI evaluation exercise. The network obtains a new state-of-the-art result, improving the F1 score by 1.3 points for detection and by 6.5 points for classification, using a single-step approach. The results also provide further evidence that embeddings have a major impact on the performance of such architectures. Comment: to appear at CLiC-it 201

arXiv.org e-Print Archive

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

OpenEdition

Dissertations of the University of Groningen

ProTestA: Identifying and Extracting Protest Events in News. Notebook for ProtestNews Lab at CLEF 2019

Author: Basile Angelo
Caselli Tommaso
Publication venue: CEUR Workshop Proceedings (CEUR-WS.org)
Publication date: 01/01/2019

This notebook describes our participation in the ProtestNews Lab, identifying protest events in news articles in English. Systems are challenged to perform unsupervised domain adaptation on three sub-tasks: document classification, sentence classification, and event extraction. We describe the final submitted systems for all sub-tasks, as well as a series of negative results. Results indicate fairly robust performance in all tasks (average F1 of 0.705 for the document classification sub-task, average F1 of 0.592 for the sentence classification sub-task, and average F1 of 0.528 for the event extraction sub-task), ranking in the top 4 systems, although drops on the out-of-domain test sets are not minimal.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

H$_2$ ortho-to-para conversion on grains: A route to fast deuterium fractionation in dense cloud cores?

Author: Bovino Stefano
Caselli Paola
Grassi Tommaso
Schleicher Dominik R.
Publication venue: 'American Astronomical Society'
Publication date: 23/10/2017

Deuterium fractionation, i.e. the enhancement of deuterated species with respect to the non-deuterated ones, is considered to be a reliable chemical clock of star-forming regions. This process is strongly affected by the ortho-to-para (o-p) H$_2$ ratio. In this letter we explore the effect of the o-p H$_2$ conversion on grains on the deuteration timescale in fully depleted dense cores, including the most relevant uncertainties that affect this complex process. We show that (i) the o-p H$_2$ conversion on grains is not strongly influenced by the uncertainties on the conversion time and the sticking coefficient and (ii) the process is controlled by the temperature and the residence time of ortho-H$_2$ on the surface, i.e. by the binding energy. We find that for binding energies between 330 and 550 K, depending on the temperature, the o-p H$_2$ conversion on grains can shorten the deuterium fractionation timescale by orders of magnitude, opening a new route to explain the large observed deuteration fraction $D_\mathrm{frac}$ in dense molecular cloud cores. Our results suggest that the star formation timescale, when estimated through the timescale to reach the observed deuteration fractions, might be shorter than previously proposed. However, more accurate measurements of the binding energy are needed to better assess the overall role of this process. Comment: Accepted for publication in ApJ Letter

arXiv.org e-Print Archive

Copenhagen University Research Information System

MPG.PuRe

All That Glitters is Not Gold: Transfer-learning for Offensive Language Detection in Dutch

Author: Caselli Tommaso
Theodoridis Dion
Publication date: 22/12/2022

University of Groningen

Identifying communicative functions in discourse with content types

Author: Caselli Tommaso
Moretti Giovanni
Sprugnoli R.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

Texts are not monolithic entities but rather coherent collections of micro illocutionary acts which help to convey a unitary message of content and purpose. Identifying such text segments is challenging because they require a fine-grained level of analysis even within a single sentence. At the same time, accessing them facilitates the analysis of the communicative functions of a text as well as the identification of relevant information. We propose an empirical framework for modelling micro illocutionary acts at the clause level, which we call content types, grounded in linguistic theories of text types, in particular the framework proposed by Werlich in 1976. We make available a newly annotated corpus of 279 documents (for a total of more than 180,000 tokens) belonging to different genres and temporal periods, based on a dedicated annotation scheme. We obtain an average Cohen’s kappa of 0.89 at token level. We achieve an average F1 score of 74.99% on the automatic classification of content types using a bi-LSTM model. Similar results are obtained on contemporary and historical documents, while performance across genres is more varied. This work promotes a discourse-oriented approach to information extraction and cross-fertilisation across disciplines through a computationally-aided linguistic analysis.

Archivio istituzionale della Ricerca - Università degli Studi di Parma

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

PubMed Central

Dissertations of the University of Groningen