

Adapting a relation extraction pipeline for the BioCreAtIvE II task

Authors: Grover Claire; Haddow Barry; Klein Ewan; Matthews Michael; Nielsen Leif Arda; Tobin Richard; Wang Xinglong
Publication date: 01/01/2007

Concept graphs: Applications to biomedical text categorization and concept extraction

Author: Bleik Said
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2013

As science advances, the underlying literature grows rapidly, providing valuable knowledge mines for researchers and practitioners. The text content that makes up these knowledge collections is often unstructured, so extracting relevant or novel information can be nontrivial and costly. In addition, human knowledge and expertise are being transformed into structured digital information in the form of vocabulary databases and ontologies. These knowledge bases hold substantial hierarchical and semantic relationships among common domain concepts. Consequently, automated learning tasks can be reinforced with those knowledge bases by constructing human-like representations of knowledge. This allows the development of algorithms that simulate the human reasoning tasks of content perception, concept identification, and classification. This study explores the representation of text documents using concept graphs that are constructed with the help of a domain ontology. In particular, the target data sets are collections of biomedical text documents, and the domain ontology is a collection of predefined biomedical concepts and the relationships among them. The proposed representation preserves those relationships and allows the structural features of graphs to be used in text mining and learning algorithms. Those features emphasize the significance of the underlying relationship information that exists in the text content behind the interrelated topics and concepts of a text document. The experiments presented in this study include text categorization and concept extraction applied to biomedical data sets. The experimental results demonstrate how the relationships extracted from text and captured in graph structures can be used to improve the performance of these applications. The discussed techniques can be used in creating and maintaining digital libraries by enhancing the indexing, retrieval, and management of documents, as well as in a broad range of domain-specific applications such as drug discovery, hypothesis generation, and the analysis of molecular structures in chemoinformatics.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Exploiting and integrating rich features for biological literature classification

Authors: Ding Shilin; Huang Minlie; Wang Hongning; Zhu Xiaoyan
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Efficient features play an important role in automated text classification, which facilitates access to large-scale data. In the bioscience field, biological structures and terminologies are described by a large number of features; domain-dependent features would significantly improve classification performance. How to effectively select and integrate different types of features to improve biological literature classification is the major issue studied in this paper. Results: To classify the biological literature efficiently, we propose a novel feature value schema, TF*ML; features ranging from the lower-level, domain-independent “string feature” to the higher-level, domain-dependent “semantic template feature”; and proper integrations among the features. Compared to our previous approaches, performance improves in terms of AUC and F-Score by 11.5% and 8.8% respectively, and exceeds the best performance achieved in BioCreAtIvE 2006. Conclusions: Different types of features possess different discriminative capabilities in literature classification; proper integration of domain-independent and domain-dependent features would significantly improve performance and overcome over-fitting on the data distribution.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Artificial intelligence for ocean science data integration: current state, gaps, and way forward

Authors: Bar Koby; Lehahn Yoav; Sagi Tomer
Publication venue: University of California Press
Publication date: 15/05/2020

VBN

USING MACHINE LEARNING ALGORITHMS FOR CLASSIFYING NON-FUNCTIONAL REQUIREMENTS - RESEARCH AND EVALUATION

Author: Binkhonain Manal
Publication date: 31/12/2021

The University of Manchester - Institutional Repository