Search CORE

14 research outputs found

Semi-automatic conversion of BioProp semantic annotation to PASBio annotation

Author: AL Berger
B Santorini
C Warner
Chi-Hsin Huang
D Dowty
E Charniak
H-J Dai
Hong-Jie Dai
KB Cohen
M Collins
M Palmer
O Babko-Malaya
PK Shah
R Hoernig
RA Hudson
Richard Tzong-Han Tsai
RT-H Tsai
S Pradhan
T Wattarujeekrit
V Punyakanok
W-C Chou
Wen-Lian Hsu
X Carreras
Y Kogan
Y Tateisi
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: Semantic role labeling (SRL) is an important text analysis technique. In SRL, sentences are represented by one or more predicate-argument structures (PAS). Each PAS is composed of a predicate (verb) and several arguments (noun phrases, adverbial phrases, etc.) with different semantic roles, including main arguments (agent or patient) as well as adjunct arguments (time, manner, or location). PropBank is the most widely used PAS corpus and annotation format in the newswire domain. In the biomedical field, however, more detailed and restrictive PAS annotation formats such as PASBio are popular. Unfortunately, because no annotated PASBio corpus exists, no publicly available machine-learning (ML) SRL systems based on PASBio have been developed. In previous work, we constructed a biomedical corpus based on the PropBank standard called BioProp, on which we developed an ML-based SRL system, BIOSMILE. In this paper, we aim to build a system to convert BIOSMILE's BioProp annotation output to PASBio annotation. Our system consists of BIOSMILE in combination with a BioProp-PASBio rule-based converter and an additional semi-automatic rule generator. Results: Our first experiment evaluated the rule-based converter's performance independently of BIOSMILE's performance. The converter achieved an F-score of 85.29%. The second experiment evaluated the combined system (BIOSMILE + rule-based converter). The system achieved an F-score of 69.08% for PASBio's 29 verbs. Conclusion: Our approach allows PAS conversion between BioProp and PASBio annotation using BIOSMILE alongside our newly developed semi-automatic rule generator and rule-based converter. Our system can match the performance of other state-of-the-art domain-specific ML-based SRL systems and can be easily customized for PASBio application development.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Entity linking for biomedical literature

Author: AB Abacha
B Liu
Boliang Zhang
CE Shannon
D Milne
Daniel Howsmon
Deborah McGuinness
H Fang
H Ji
Heng Ji
HJ Dai
J Biesiada
J Zheng
James Hendler
Jin G Zheng
Juergen Hahn
L Hirschman
L Hunter
L Page
L Ratinov
LM Akella
M Frisch
M Miwa
M Pennacchiotti
NDB Bruce
P Ferragina
PN Mendes
R Mihalcea
S Cucerzan
S Kulkarni
T Cassidy
T Lipniacki
V Punyakanok
W Shen
X Cheng
X Liu
Y Guo
Y Sun
Y Usami
Publication venue: Springer Science and Business Media LLC

Crossref

CRF Models for Tamil Part of Speech Tagging and Chunking

Author: D.M. Bikel
L.R. Rabiner
R. Garside
V. Punyakanok
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Crossref

Semantic Role Labeling of Speech Transcripts

Author: D. Gildea
L. Màrquez
M. Palmer
O. Kolomiyets
V. Punyakanok
X. Huang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Crossref

TDC: Typed Dependencies-Based Chunking Model

Author: F Sebastiani
J Kivinen
SP Abney
T Zhang
V Punyakanok
Y-C Wu
Publication venue: Springer Science and Business Media LLC

Crossref

Applying System Combination to Base Noun Phrase Identification

Author: Daelemans W.
Dejean H.
Koeling R.
Krymolowski Y.
Punyakanok V.
Roth D.
Tjong Kim Sang E.F.
Publication venue: Morgan Kaufmann Publishers
Publication date: 01/01/2000

Tilburg University Repository

Automatic Crime Prediction Using Events Extracted from Twitter Posts

Author: D. Blei
D. Gildea
D.M. Blei
G.O. Mohler
L. Màrquez
M. Gerber
S. Chainey
V. Punyakanok
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Crossref

Aspect-Object Alignment Using Integer Linear Programming

Author: A. Mukherjee
G. Qiu
H. Nishikawa
J. Yu
J. Zhao
N. Jakob
V. Punyakanok
W.M. Soon
X. Ding
Y. Zhao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Crossref

Experiments on the Identification of Predicate-Argument Structure in Polish

Author: A. Patejuk
A. Przepiórkowski
A. Radziszewski
A. Wróblewska
F. Pedregosa
J. Semeckỳ
K. Gołuchowski
M. Lopatková
M. Woliński
M. Świdziński
V. Punyakanok
W. Sun
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Crossref

FacetGist

Author: Fader A.
Gupta S.
I. G. Council
Kozareva Z.
Lin T.
Nakashole N.
Punyakanok V.
Shen D.
Shi S.
Talukdar P. P.
Tateisi Y.
Zhou D.
Zhu X.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/10/2016

Given the large volume of technical documents available, it is crucial to automatically organize and categorize these documents in order to understand and extract value from them. Towards this end, we introduce a new research problem called Facet Extraction. Given a collection of technical documents, the goal of Facet Extraction is to automatically label each document with a set of concepts for the key facets (e.g., application, technique, evaluation metrics, and dataset) that people may be interested in. Facet Extraction has numerous applications, including document summarization, literature search, patent search, and business intelligence. The major challenges in performing Facet Extraction arise from multiple sources: concept extraction, concept-to-facet matching, and facet disambiguation. To tackle these challenges, we develop FacetGist, a framework for facet extraction. Facet Extraction involves constructing a graph-based heterogeneous network to capture information available across multiple local sentence-level features, as well as global context features. We then formulate a joint optimization problem and propose an efficient algorithm for graph-based label propagation to estimate the facet of each concept mention. Experimental results on technical corpora from two domains demonstrate that Facet Extraction can lead to an improvement of over 25% in both precision and recall over competing schemes.

Crossref

eScholarship - University of California