Italian Event Detection Goes Deep Learning
This paper reports on a set of experiments with different word embeddings to
initialize a state-of-the-art Bi-LSTM-CRF network for event detection and
classification in Italian, following the EVENTI evaluation exercise. The network
obtains a new state-of-the-art result, improving the F1 score by 1.3 points for
detection and by 6.5 points for classification, using a single-step approach.
The results also provide further evidence that embeddings
have a major impact on the performance of such architectures. Comment: to appear at CLiC-it 201
ProTestA: Identifying and Extracting Protest Events in News. Notebook for ProtestNews Lab at CLEF 2019
This notebook describes our participation in the ProtestNews Lab, which targets the identification of protest events in English news articles. Systems are challenged to perform unsupervised domain adaptation across three sub-tasks: document classification, sentence classification, and event extraction. We describe the final submitted systems for all sub-tasks, as well as a series of negative results. Results indicate fairly robust performance in all tasks (average F1 of 0.705 for the document classification sub-task, average F1 of 0.592 for the sentence classification sub-task, and average F1 of 0.528 for the event extraction sub-task), ranking in the top 4 systems, although the drops on the out-of-domain test sets are not negligible
H₂ ortho-to-para conversion on grains: A route to fast deuterium fractionation in dense cloud cores?
Deuterium fractionation, i.e. the enhancement of deuterated species with
respect to the non-deuterated ones, is considered to be a reliable chemical
clock of star-forming regions. This process is strongly affected by the
ortho-to-para (o-p) H₂ ratio. In this letter we explore the effect of the
o-p H₂ conversion on grains on the deuteration timescale in fully depleted
dense cores, including the most relevant uncertainties that affect this complex
process. We show that (i) the o-p H₂ conversion on grains is not strongly
influenced by the uncertainties on the conversion time and the sticking
coefficient and (ii) the process is controlled by the temperature and the
residence time of ortho-H₂ on the surface, i.e. by the binding energy. We
find that for binding energies between 330 and 550 K, depending on the
temperature, the o-p H₂ conversion on grains can shorten the deuterium
fractionation timescale by orders of magnitude, opening a new route to explain
the large observed deuteration fraction in dense molecular
cloud cores. Our results suggest that the star formation timescale, when
estimated through the timescale to reach the observed deuteration fractions,
might be shorter than previously proposed. However, more accurate measurements
of the binding energy are needed to better assess the overall role of this
process. Comment: Accepted for publication in ApJ Letter
Identifying communicative functions in discourse with content types
Texts are not monolithic entities but rather coherent collections of micro illocutionary acts which help to convey a unitary message of content and purpose. Identifying such text segments is challenging because they require a fine-grained level of analysis even within a single sentence. At the same time, accessing them facilitates the analysis of the communicative functions of a text as well as the identification of relevant information. We propose an empirical framework for modelling micro illocutionary acts at clause level, which we call content types, grounded in linguistic theories of text types, in particular the framework proposed by Werlich in 1976. We make available a newly annotated corpus of 279 documents (for a total of more than 180,000 tokens) belonging to different genres and temporal periods, based on a dedicated annotation scheme. We obtain an average Cohen’s kappa of 0.89 at token level. We achieve an average F1 score of 74.99% on the automatic classification of content types using a bi-LSTM model. Similar results are obtained on contemporary and historical documents, while performance across genres is more varied. This work promotes a discourse-oriented approach to information extraction and cross-fertilisation across disciplines through a computationally-aided linguistic analysis