22 research outputs found

    Doctor of Philosophy

    The objective of this work is to examine the efficacy of natural language processing (NLP) in summarizing bibliographic text for multiple purposes. Researchers have noted the accelerating growth of bibliographic databases. Information seekers using traditional information retrieval techniques when searching large bibliographic databases are often overwhelmed by excessive, irrelevant data. Scientists have applied natural language processing technologies to improve retrieval. Text summarization, a natural language processing approach, simplifies bibliographic data while filtering it to address a user's need. Traditional text summarization can necessitate the use of multiple software applications to accommodate diverse processing refinements known as "points-of-view." A new, statistical approach to text summarization can transform this process. Combo, a statistical algorithm comprising three individual metrics, determines which elements within input data are relevant to a user's specified information need, thus enabling a single software application to summarize text for many points-of-view. In this dissertation, I describe this algorithm and the research process used in developing and testing it. Four studies comprised the research process. The goal of the first study was to create a conventional schema accommodating a genetic disease etiology point-of-view, and an evaluative reference standard. This was accomplished by simulating the task of secondary genetic database curation. The second study addressed the development and initial evaluation of the algorithm, comparing its performance to the conventional schema using the previously established reference standard, again within the task of secondary genetic database curation. The third and fourth studies evaluated the algorithm's performance in accommodating additional points-of-view in a simulated clinical decision support task.
The third study explored prevention, while the fourth evaluated performance for prevention and drug treatment, comparing results to a conventional treatment schema's output. Both summarization methods identified data that were salient to their tasks. The conventional genetic disease etiology and treatment schemas located salient information for database curation and decision support, respectively. The Combo algorithm located salient genetic disease etiology, treatment, and prevention data for the associated tasks. Dynamic text summarization could potentially serve additional purposes, such as consumer health information delivery, systematic review creation, and primary research. This technology may benefit many user groups.
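The abstract describes Combo only as a statistical algorithm that combines three metrics to score input elements against a point-of-view. A minimal sketch of that general idea follows; the three metrics used here (point-of-view term overlap, corpus frequency, and brevity) are illustrative assumptions, not the dissertation's actual formulas.

```python
from collections import Counter

def combo_score(sentence, pov_terms, doc_term_counts):
    """Average three illustrative relevance metrics into one score."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    # Metric 1 (assumed): fraction of words matching the point-of-view terms
    overlap = sum(1 for w in words if w in pov_terms) / len(words)
    # Metric 2 (assumed): mean corpus frequency of the sentence's words
    max_count = max(doc_term_counts.values()) if doc_term_counts else 1
    freq = sum(doc_term_counts.get(w, 0) for w in words) / len(words)
    freq_norm = freq / max_count
    # Metric 3 (assumed): mild preference for shorter, scannable sentences
    brevity = 1.0 / (1.0 + len(words) / 20.0)
    return (overlap + freq_norm + brevity) / 3.0

def summarize(sentences, pov_terms, top_k=2):
    """Rank sentences for one point-of-view; changing pov_terms
    re-targets the same code to a different information need."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    ranked = sorted(sentences,
                    key=lambda s: combo_score(s, pov_terms, counts),
                    reverse=True)
    return ranked[:top_k]
```

Because the point-of-view enters only through a term set, one program can serve etiology, treatment, or prevention views by swapping that set, which is the single-application property the abstract attributes to Combo.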

    Automatic Identification of Interestingness in Biomedical Literature

    This thesis presents research on automatically identifying interestingness in a graph of semantic predications. Interestingness is a subjective quality of information that reflects its value in meeting a user's known or unknown retrieval needs. The perception of information as interesting requires a level of utility for the user as well as a balance between significant novelty and sufficient familiarity. It can also be influenced by additional factors such as unexpectedness or serendipity relative to recent experiences. The ability to identify interesting information facilitates the development of user-centered retrieval, especially in semantic information summarization and in iterative, step-wise searching such as in discovery browsing systems. Ultimately, this allows biomedical researchers to more quickly identify information of greatest potential interest to them, whether expected or, perhaps more importantly, unexpected. Current discovery browsing systems use iterative information retrieval to discover new knowledge - a process that requires finding relevant co-occurring topics and relationships through consistent human involvement to identify interesting concepts. Although interestingness is subjective, this thesis identifies computable quantities in semantic data that correlate to interestingness in user searches. We compare several statistical and rule-based models correlating graph data extracted from semantic predications with concept interestingness as demonstrated in PubMed queries. Semantic predications represent scientific assertions extracted from all of the biomedical literature contained in the MEDLINE database. They are of the form subject-predicate-object. Predications can easily be represented as graphs, where subjects and objects are nodes and predicates form edges. A graph of predications represents the assertions made in the citations from which the predications were extracted.
This thesis uses graph metrics to identify features from the predication graph for model generation. These features are based on degree centrality (connectedness) of the seed concept node and surrounding nodes; they are also based on frequency of occurrence measures of the edges between the seed concept and surrounding nodes as well as between the nodes surrounding the seed concept and the neighbors of those nodes. A PubMed query log is used for training and testing models for interestingness. This log contains a set of user searches over a 24-hour period, and we make the assumption that co-occurrence of concepts with the seed concept in searches demonstrates interestingness of that concept with regard to the seed concept. Graph generation begins with the selection of the set of all predications containing the seed concept from the Semantic Medline database (our training dataset uses Alzheimer's disease as the seed concept). The graph is built with the seed concept as the central node. Additional nodes are added for each concept that occurs with the seed concept in the initial predications, and an edge is created for each instance of a predication containing the two concepts. The edges are labeled with the specific predicate in the predication. This graph is extended to include additional nodes within two leaps from the seed concept. The concepts in the PubMed query logs are normalized to UMLS concepts or Entrez Gene symbols using MetaMap. Token-based and user-based counts are collected for each co-occurring term. These measures are combined to create a weighted score which is used to determine three potential thresholds of interestingness based on deviation from the mean score. The concepts that are included in both the graph and the normalized log data are identified for use in model training and testing.
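The graph construction and degree-centrality feature described above can be sketched in a few lines. The predication triples below are invented examples, not data from Semantic MEDLINE, and the adjacency-map representation is an assumption for illustration.

```python
from collections import defaultdict

def build_graph(predications):
    """Turn subject-predicate-object triples into an undirected graph:
    node -> list of (neighbor, predicate), one edge per predication."""
    graph = defaultdict(list)
    for subj, pred, obj in predications:
        graph[subj].append((obj, pred))
        graph[obj].append((subj, pred))
    return graph

def degree_centrality(graph, node):
    """Edge count of `node`, normalized by the number of other nodes."""
    n = len(graph)
    if n <= 1:
        return 0.0
    return len(graph[node]) / (n - 1)

# Invented predications around a seed concept
predications = [
    ("donepezil", "TREATS", "alzheimer_disease"),
    ("alzheimer_disease", "COEXISTS_WITH", "dementia"),
    ("apoe", "ASSOCIATED_WITH", "alzheimer_disease"),
]
graph = build_graph(predications)
seed_centrality = degree_centrality(graph, "alzheimer_disease")
```

With four nodes and three edges all touching the seed, the seed's degree centrality is 1.0 while each neighbor's is 1/3, giving the kind of connectedness feature the thesis feeds into its models.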

    Semantic Approaches for Knowledge Discovery and Retrieval in Biomedicine


    The Detection of Contradictory Claims in Biomedical Abstracts

    Research claims in the biomedical domain are not always consistent, and may even be contradictory. This thesis explores contradictions between research claims in order to determine whether or not it is possible to develop a solution to automate the detection of such phenomena. Such a solution will help decision-makers, including researchers, to alleviate the effects of contradictory claims on their decisions. This study develops two methodologies to construct corpora of contradictions. The first methodology utilises systematic reviews to construct a manually-annotated corpus of contradictions. The second methodology uses a different approach to construct a corpus of contradictions which does not rely on human annotation. This methodology is proposed to overcome the limitations of the manual annotation approach. Moreover, this thesis proposes a pipeline to detect contradictions in abstracts. The pipeline takes a question and a list of research abstracts which may contain answers to it. The output of the pipeline is a list of sentences extracted from the abstracts which answer the question, where each sentence is annotated with an assertion value with respect to the question. Claims which feature opposing assertion values are considered potentially contradictory. The research demonstrates that automating the detection of contradictory claims in research abstracts is a feasible problem.
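The final step of the pipeline described above can be sketched as follows: given answer sentences already annotated with assertion values for one question, pair up claims whose values oppose. The sentences, the yes/no label scheme, and the helper name are illustrative assumptions; the hard part, extracting and labeling the claims, is omitted.

```python
from itertools import product

# Assumed label scheme: each claim answers the same question with
# an assertion value of "yes" or "no"; opposing values may contradict.
OPPOSING = {("yes", "no"), ("no", "yes")}

def potential_contradictions(labeled_claims):
    """labeled_claims: list of (sentence, assertion) tuples.
    Returns each unordered pair of sentences whose assertion
    values oppose each other."""
    pairs = []
    for (s1, a1), (s2, a2) in product(labeled_claims, repeat=2):
        if s1 < s2 and (a1, a2) in OPPOSING:  # each pair once
            pairs.append((s1, s2))
    return pairs

claims = [
    ("Aspirin reduces stroke risk.", "yes"),
    ("Aspirin does not reduce stroke risk.", "no"),
]
found = potential_contradictions(claims)
```

The pairing itself is trivial; the pipeline's substance lies in question answering and assertion classification, which is why the thesis frames those as the components to evaluate.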

    OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression

    Background: Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering.
    Results: OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 to .72 (precision .39 to .85, recall .16 to .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances.
    Conclusion: OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available at http://bionlp.sourceforge.net/
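The evaluation above reports F-scores alongside precision and recall. For readers unfamiliar with the metric, the balanced F-score (F1) is simply the harmonic mean of the two; the example values below are illustrative, not the paired numbers from the OpenDMAP evaluations.

```python
def f1_score(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall.
    Low recall drags F1 down sharply even when precision is high."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For instance, a system with precision 0.85 but recall 0.16 scores only about 0.27, which is why the reported F-score range sits well below the top of the precision range.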

    31st International Conference on Information Modelling and Knowledge Bases

    Information modelling is becoming a more and more important topic for researchers, designers, and users of information systems. The amount and complexity of information itself, the number of abstraction levels of information, and the size of databases and knowledge bases are continuously growing. Conceptual modelling is one of the sub-areas of information modelling. The aim of this conference is to bring together experts from different areas of computer science and other disciplines who have a common interest in understanding and solving problems on information modelling and knowledge bases, as well as applying the results of research to practice. We also aim to recognize and study new areas of modelling and knowledge bases to which more attention should be paid. Therefore philosophy and logic, cognitive science, knowledge management, linguistics and management science are relevant areas, too. In the conference, there will be three categories of presentations, i.e. full papers, short papers and position papers.