Testing Extensive Use of NER tools in Article Classification and a Statistical Approach for Method Interaction Extraction in the Protein-Protein Interaction Literature

By Anália Lourenço, Michael Conover, Andrew Wong, Fengxia Pan, Alaa Abi-haidar, Azadeh Nematzadeh, Hagit Shatkay and Luis M. Rocha


We participated (as Team 81) in the Article Classification (ACT) and Interaction Method (IMT) subtasks of the Protein-Protein Interaction task of the BioCreative III Challenge. For the ACT, we extensively tested available Named Entity Recognition (NER) tools and used the most promising ones to extend the Variable Trigonometric Threshold (VTT) linear classifier that we successfully used in BioCreative II and II.5. Our main goal was to exploit the power of available NER tools to aid in the classification of documents relevant to Protein-Protein Interaction. We also used a Support Vector Machine classifier on NER features for comparison purposes. For the IMT, we experimented with a primarily statistical approach, as opposed to a deeper natural language processing strategy; in a nutshell, we exploited classifiers, simple pattern matching, and ranking of candidate matches using statistical considerations. We will also report on our efforts to integrate our IMT method sentence classifier into our ACT pipeline.

Article Classification Task

We participated in both the online submission with our own annotation server implementing the VTT algorithm via the BioCreative MetaServe
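As a rough illustration of the statistical IMT strategy mentioned in the abstract (simple pattern matching followed by ranking of candidate matches), the Python sketch below matches a small dictionary of interaction-detection method names against sentences and ranks the candidates by the fraction of sentences that mention them. The method labels, the regular expressions, the `rank_methods` helper, and the scoring rule are all assumptions made for illustration; they are not taken from the paper and should not be read as the team's actual system.

```python
import re
from collections import Counter

# Hypothetical dictionary: method label -> regex pattern (not from the paper).
METHOD_PATTERNS = {
    "two hybrid": re.compile(r"\btwo[- ]hybrid\b", re.IGNORECASE),
    "coimmunoprecipitation": re.compile(r"\bco-?immunoprecipitat\w*", re.IGNORECASE),
    "pull down": re.compile(r"\bpull[- ]?down\b", re.IGNORECASE),
}


def rank_methods(sentences):
    """Return (method, score) pairs sorted by descending score.

    Assumed scoring rule: the fraction of sentences that contain at least
    one match for the method. Ties are broken alphabetically by label.
    """
    hits = Counter()
    for sentence in sentences:
        for method, pattern in METHOD_PATTERNS.items():
            if pattern.search(sentence):
                hits[method] += 1
    total = len(sentences) or 1  # avoid division by zero on empty input
    ranked = [(method, count / total) for method, count in hits.items()]
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked
```

Calling `rank_methods` on the sentences of an article would yield an ordered list of candidate methods. A real system would presumably draw its patterns from a curated vocabulary and combine several statistical signals rather than a single frequency.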

Year: 2011
OAI identifier: oai:CiteSeerX.psu:
Provided by: CiteSeerX