6 research outputs found

Sieve-based relation extraction of gene regulatory networks from biological literature

Author: A Franceschini
A Koike
A Mitchell
A Yates
AP Davis
BA Traag
Blaž Zupan
C Giuliano
C Nédellec
CH Wei
D Freitag
D Higgins
H Lee
H Liu
H Polen
HM Müller
IS Peter
J Amberger
J Björne
J Errington
J Kim
J Makhoul
JD Kim
JD Lafferty
JD Osborne
K Hakala
LA Ramshaw
LT MacNeil
M Ashburner
M Banko
M Bansal
M Garcia
M Krallinger
M Kwak
M Schmalisch
Marinka Žitnik
Marko Bajec
N Kambhatla
R Bossy
RC Bunescu
RM Piro
S Brin
S Pyysalo
S Sarawagi
S Van Landeghem
S Zitnik
S Žitnik
Slavko Žitnik
T Cohn
T Provoost
T Sauka-Spengler
T Wang
V Claveau
Y Li
Y Moreau
Y Xu
Z Xiang
Publication venue: Springer Science and Business Media LLC

The CHEMDNER corpus of chemicals and drugs and its annotation principles

Author: Akhondi S.A. (Saber A.)
Alves R. (Rui)
An X. (Xin)
Ata C. (Caglar)
Bajec M. (Marko)
Batista-Navarro R.T. (Riza Theresa)
Campos D. (David)
Can T. (Tolga)
Choi M. (Miji)
Couto F.M. (Francisco M.)
Dai H.J (Hong-Jie)
Dieb T.M. (Thaer M.)
Ekbal A. (Asif)
Giles C.L. (C. Lee)
Huber T. (Torsten)
Irmer M. (Matthias)
Ji D. (Donghong)
Khabsa M. (Madian)
Kors J.A. (Jan A.)
Krallinger M. (Martin)
Lamurias A. (Andre)
Leaman R. (Robert)
Leitner F. (Florian)
Liu H. (Hongfang)
Lowe D.M. (Daniel M.)
Lu Y. (Yanan)
Lu Z. (Zhiyong)
Martínez P. (Paloma)
Matos S. (Sérgio)
Munkhdalai T. (Tsendsuren)
Nathan S. (Senthil)
Oyarzabal J. (Julen)
Rabal O. (Obdulia)
Rak R. (Rafal)
Ramanan S.V. (S.V.)
Ravikumar K.E. (Komandur Elayavilli)
Rocktäschel T. (Tim)
Ryu K.H. (Keun Ho)
Salgado D. (David)
Sayle R.A. (Roger A.)
Segura-Bedmar I. (Isabel)
Sikdar U.K. (Utpal Kumar)
Tang B. (Buzhou)
Tzong-Han-Tsai R. (Richard)
Usié A. (Anabel)
Valencia A. (Alfonso)
Vazquez M. (Miguel)
Verspoor K. (Karin)
Weber L. (Lutz)
Xu H. (Hua)
Xu S. (Shuo)
Yoshioka M. (Masaharu)
Zitnik S. (Slavko)
Publication venue: Chemistry Central
Publication date: 01/01/2015

The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative of all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text were measured using an agreement study between annotators, obtaining a percentage agreement of 91%. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus

Universidad de Navarra

Erasmus University Digital Repository

Dadun, University of Navarra

Automatic analysis of online conversations as processes

Author: Compagno Dario
Deneckere Rebecca
Salinesi Camille
Viorica Epure Elena
Zitnik Slavko
Publication venue: HAL CCSD
Publication date: 03/05/2017

The tremendous use of social media has changed the way society communicates and interacts, leading to a plethora of online conversations (Perrin et al., 2017). The increasing availability of these conversations as behavioral traces has enabled automatic approaches for behavior discovery and analysis. These approaches, grounded in machine learning, data mining and language processing, have become effective predictive components and intelligent descriptive tools for many domains.

HAL Descartes

HAL-Paris1

Hal-Diderot

Process Models of Interrelated Speech Intentions from Online Health-related Conversations

Author: Bajec Marko
Compagno Dario
Deneckere Rebecca
Epure Elena
Salinesi Camille
Zitnik Slavko
Publication venue: Elsevier BV
Publication date: 01/01/2018

Because online communication is related to the adoption of new beliefs, attitudes and, ultimately, behaviors, analyzing it is of utmost importance for medicine. Multiple health care and academic communities, such as those concerned with information seeking and dissemination and with persuasive technologies, acknowledge this need. However, in order to obtain understanding, a relevant way to model online communication for the study of behavior is required. In this paper, we propose an automatic method to reveal process models of interrelated speech intentions from conversations. Specifically, a domain-independent taxonomy of speech intentions is adopted, an annotated corpus of Reddit conversations is released, supervised classifiers for speech intention prediction from utterances are trained and assessed using 10-fold cross validation (multi-class, one-versus-all and multi-label setups) and an approach to transform conversations into well-defined, representative logs of verbal behavior, needed by process mining techniques, is designed. The experimental results show that: 1) the automatic classification of intentions is feasible (with Kappa scores varying between 0.52 and 1); 2) predicting pairs of intentions, also known as adjacency pairs, or including more utterances from even other heterogeneous corpora can improve the predictions of some classes; and 3) the classifiers in the current state are robust enough to be used on other corpora, although the results are poorer and suggest that the input corpus may not sufficiently capture varied ways of expressing certain intentions. The extracted process models of interrelated speech intentions open new views on grasping the formation of beliefs and behavioral intentions in and from speech, but in-depth evaluation of these models is still required.

HAL-Paris1

The CHEMDNER corpus of chemicals and drugs and its annotation principles

Author: Akhondi Saber A.
Alves Rui
An Xin
Ata Caglar
Bajec Marko
Batista-Navarro Riza Theresa
Campos David
Can Tolga
Choi Miji
Couto Francisco M.
Dai Hong-Jie
Dieb Thaer M.
Ekbal Asif
Giles C. Lee
Huber Torsten
Irmer Matthias
Ji Donghong
Khabsa Madian
Kors Jan A.
Krallinger Martin
Lamurias Andre
Leaman Robert
Leitner Florian
Liu Hongfang
Lowe Daniel M.
Lu Yanan
Lu Zhiyong
Martinez Paloma
Matos Sérgio
Munkhdalai Tsendsuren
Nathan Senthil
Oyarzabal Julen
Rabal Obdulia
Rak Rafal
Ramanan S. V.
Ravikumar Komandur Elayavilli
Rocktaschel Tim
Ryu Keun Ho
Salgado David
Sayle Roger A.
Segura-Bedmar Isabel
Sikdar Utpal Kumar
Tang Buzhou
Tsai Richard Tzong-Han
Usie Anabel
Valencia Alfonso
Vazquez Miguel
Verspoor Karin
Weber Lutz
Xu Hua
Xu Shuo
Yoshioka Masaharu
Zitnik Slavko
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Springer - Publisher Connector

PubMed Central

Repositori Obert UdL

OpenMETU (Middle East Technical University)

Archivo Digital UPM