
444 research outputs found

Discovering context-specific relationships from biological literature by using multi-level context terms

Author: A Sehgal
AP Davis
AZ Ijaz
BL Humphreys
Doheon Lee
DR Swanson
I Petric
Jaejoon Choi
JD Kim
JL Myers
JR Finkel
K Hirohata
Kyunghyun Park
L Larbanoix
M Hewett
M Wang
M Weeber
Min Song
NC Baker
P Srinivasan
PL McClean
R Sultana
RA DiGiacomo
S Iivonen
Sejoon Lee
WM Pardridge
Y Huagn
Publication venue: BioMed Central
Publication date: 01/01/2012

Abstract. Background: Swanson's ABC model is a powerful way to infer hidden relationships buried in the biological literature. However, the model cannot infer relations that carry context information. In addition, it generates a very large number of candidates from biological text, and it is a semi-automatic, labor-intensive technique that requires manual input from human experts. To tackle these problems, we incorporate context terms to infer relations between AB interactions and BC interactions. Methods: We propose three steps to discover meaningful hidden relationships between drugs and diseases: 1) multi-level (gene, drug, disease, symptom) entity recognition, 2) interaction extraction (drug-gene, gene-disease) from the literature, and 3) context-vector-based similarity score calculation. We then evaluate our hypothesis on a dataset of 77,711 PubMed abstracts related to Alzheimer's disease, using the PharmGKB and CTD databases as gold standards. Evaluation is conducted in two ways: first, comparing the precision of the proposed method with that of the previous method; second, analysing the top 10 ranked results to examine whether highly ranked interactions are truly meaningful. Results: Context-based relation inference achieved better precision than the previous ABC model approach. The literature analysis also shows that interactions inferred by the context-based approach are more meaningful than those inferred by the previous ABC model. Conclusions: We propose a novel interaction inference technique that incorporates context term vectors into the ABC model to discover meaningful hidden relationships. By utilizing multi-level context terms, our model performs better than the previous ABC model.
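The third step of the method above (context-vector-based similarity scoring) can be illustrated with a minimal sketch. It assumes each extracted interaction carries a bag of context terms and that cosine similarity is the score; the abstract does not state the exact measure, and the names `cosine`, `rank_drug_disease`, `drugX`, `drugY`, `GENE1` and `diseaseZ` are hypothetical.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_drug_disease(drug_gene, gene_disease):
    """Join A-B and B-C interactions on the shared gene B and rank the
    resulting drug-disease pairs by the similarity of their contexts."""
    scores = {}
    for (drug, gene), ctx_ab in drug_gene.items():
        for (gene2, disease), ctx_bc in gene_disease.items():
            if gene == gene2:
                key = (drug, disease)
                # Keep the best score when several genes link the same pair.
                scores[key] = max(scores.get(key, 0.0), cosine(ctx_ab, ctx_bc))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (hypothetical): context terms observed around each interaction.
drug_gene = {
    ("drugX", "GENE1"): Counter({"memory": 2, "plaque": 1}),
    ("drugY", "GENE1"): Counter({"rash": 3}),
}
gene_disease = {
    ("GENE1", "diseaseZ"): Counter({"memory": 1, "plaque": 1}),
}
ranked = rank_drug_disease(drug_gene, gene_disease)
```

In this toy case both drugs reach `diseaseZ` through `GENE1`, but only `drugX` shares context terms with the gene-disease interaction, so it ranks first. A plain ABC join would return both candidates without distinguishing them.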

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Literature Based Discovery (LBD): Towards Hypothesis Generation and Knowledge Discovery in Biomedical Text Mining

Author: Bhasuran Balu
Murugesan Gurusamy
Natarajan Jeyakumar
Publication date: 03/10/2023

Biomedical knowledge is growing at an astounding pace, with the majority of this knowledge represented as scientific publications. Text mining tools and methods represent automatic approaches for extracting hidden patterns and trends from this semi-structured and unstructured data. In biomedical text mining, Literature Based Discovery (LBD) is the process of automatically discovering novel associations between medical terms that are otherwise mentioned only in disjoint literature sets. LBD approaches have proven successful in reducing the time needed to discover potential associations hidden in the vast amount of scientific literature. The process focuses on creating concept profiles for medical terms such as a disease or symptom and connecting them with a drug or treatment based on the statistical significance of the shared profiles. This knowledge discovery approach, introduced in 1989, remains a core task in text mining. Currently, two approaches based on the ABC principle, namely open discovery and closed discovery, are the most explored in the LBD process. This review starts with a general introduction to text mining, followed by biomedical text mining, and introduces various literature resources such as MEDLINE, UMLS, MeSH, and SemMedDB. This is followed by a brief introduction to the core ABC principle and its two associated approaches, open discovery and closed discovery, in the LBD process. This review also discusses deep learning applications in LBD by reviewing the role of transformer models and neural-network-based LBD models and their future aspects. Finally, it reviews the key biomedical discoveries generated through LBD approaches in biomedicine and concludes with the current limitations and future directions of LBD.

Comment: 43 pages, 5 figures, 4 tables
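The two ABC approaches named above can be sketched over a toy co-occurrence map. This is an illustration under simplifying assumptions (a symmetric term-to-terms map, with no statistical filtering of the shared profiles), not the procedure of any particular LBD system; every term name and both function names are hypothetical.

```python
def open_discovery(a, cooccur):
    """Open discovery: start from A alone and return every C reachable
    through some B that is not already directly linked to A."""
    b_terms = cooccur.get(a, set())
    candidates = {}
    for b in b_terms:
        for c in cooccur.get(b, set()):
            if c != a and c not in b_terms:
                candidates.setdefault(c, set()).add(b)
    return candidates  # maps each C to the B terms that support it

def closed_discovery(a, c, cooccur):
    """Closed discovery: A and C are both given; return the linking B terms."""
    return cooccur.get(a, set()) & cooccur.get(c, set())

# Toy symmetric co-occurrence map (hypothetical terms).
cooccur = {
    "term_A": {"term_B1", "term_B2"},
    "term_B1": {"term_A", "term_C"},
    "term_B2": {"term_A", "term_C", "term_D"},
    "term_C": {"term_B1", "term_B2"},
    "term_D": {"term_B2"},
}
hypotheses = open_discovery("term_A", cooccur)
links = closed_discovery("term_A", "term_C", cooccur)
```

Here `term_A` never co-occurs with `term_C` or `term_D`, so open discovery proposes both, with `term_C` supported by two intermediate terms and `term_D` by one. Closed discovery then explains a chosen A-C pair by listing its shared B terms.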

arXiv.org e-Print Archive

A HYBRID APPROACH FOR TRANSLATIONAL RESEARCH

Author: Webster Yue Wang
Publication date: 01/06/2010

Indiana University-Purdue University Indianapolis (IUPUI)

Translational research has proven to be a powerful process that bridges the gap between basic science and medical practice. The complexity of translational research is two-fold: integration of vast amounts of information held in disparate silos, and dissemination of discoveries to stakeholders with different interests. We designed and implemented a hybrid knowledge discovery framework. We developed strategies to leverage both traditional biomedical databases and Health Social Network Communities content in the discovery process. Heuristic and quantitative evaluations were carried out in the Colorectal Cancer and Amyotrophic Lateral Sclerosis disease areas. The results demonstrate the potential of our approach to bridge silos and to identify hidden links among clinical observations, drugs, genes and diseases, which may eventually lead to the discovery of novel disease targets, biomarkers and therapies.

IUPUIScholarWorks

Exploiting Latent Features of Text and Graphs

Author: Sybrandt Justin George
Publication venue: Clemson University Libraries
Publication date: 01/05/2020

As the size and scope of online data continue to grow, new machine learning techniques become necessary to best capitalize on the wealth of available information. However, the models that help convert data into knowledge require nontrivial processes to make sense of large collections of text and massive online graphs. In both scenarios, modern machine learning pipelines produce embeddings, semantically rich vectors of latent features, to convert human constructs for machine understanding. In this dissertation we focus on information available within biomedical science, including human-written abstracts of scientific papers, as well as machine-generated graphs of biomedical entity relationships. We present the Moliere system, and our method for identifying new discoveries through the use of natural language processing and graph mining algorithms. We propose heuristically-based ranking criteria to augment Moliere, and leverage this ranking to identify a new gene-treatment target for HIV-associated Neurodegenerative Disorders. We additionally focus on the latent features of graphs, and propose a new bipartite graph embedding technique. Using our graph embedding, we advance the state-of-the-art in hypergraph partitioning quality. Having newfound intuition of graph embeddings, we present Agatha, a deep-learning approach to hypothesis generation. This system learns a data-driven ranking criterion derived from the embeddings of our large proposed biomedical semantic graph. To produce human-readable results, we additionally propose CBAG, a technique for conditional biomedical abstract generation.

Clemson University: TigerPrints

Concept Based Knowledge Discovery from Biomedical Literature

Author: Radovanovic Aleksandar
Publication venue: University of the Western Cape Library Service
Publication date: 01/01/2009

Philosophiae Doctor - PhD

This thesis describes and introduces novel methods for knowledge discovery and presents a software system that is able to extract information from biomedical literature, review interesting connections between various biomedical concepts and, in so doing, generate new hypotheses. The experimental results obtained by using the methods described in this thesis are compared to currently published results obtained by other methods, and a number of case studies are described. This thesis shows how the technology presented can be integrated with the researcher's own knowledge, experimentation and observations for optimal progression of scientific research.

South Africa

UWC Theses and Dissertations

Integrated Bio-Entity Network: A System for Biological Knowledge Discovery

Author: A Ceol
A Chatr-aryamontri
A Coulet
A Grote
A Koike
A Mottaz
A Rzhetsky
A Yuryev
B Aranda
C Alfarano
C Blaschke
C Friedman
C Stark
CB Giles
CF Schaefer
D Barrell
D Hristovski
D Maglott
D Tikk
DR Swanson
EW Dijkstra
F Leitner
G Gonzalez
GR Mishra
H Liu
I Iossifov
I Vastrik
J Bjorne
JD Wren
Jinfeng Zhang
JO Korbel
Jun S. Liu
K Du
K Han
KD Pruitt
L Gong
L Salwinski
Lindsey Bell
LJ Jensen
LS Wong
M Ashburner
M Castagna
M Devignes
M Huang
M Kanehisa
M Krallinger
M Kuhn
M Yetisgen-Yildiz
MG Kann
N Daraselia
N Sierro
OL Griffith
P Pagel
P Shahi
P Srinivasan
QC Bui
R Apweiler
R Chowdhary
R Crnich
R Frijters
R Hoffmann
R Saetre
Rajesh Chowdhary
S Gama-Castro
S Mathivanan
S Naidu
S Yilmaz
T Beuming
TH Cormen
TS Keshava Prasad
V Matys
Xufeng Niu
Y Li
Y Wang
Ying Xu
Z Gao
Z Huang
Publication venue: Public Library of Science
Publication date: 27/06/2011

A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulated at an increasing speed, the information on bio-entity relationships is archived in different forms at scattered places. Most of this information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates the study of biological systems using integrative approaches, but also allows discovery of new knowledge in an automatic and systematic way. In this study, we performed a large-scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction from unstructured text in scientific literature. The relationship information we integrated in this study includes protein–protein interactions, protein/gene regulations, protein–small molecule interactions, protein–GO relationships, protein–pathway relationships, and pathway–disease relationships. The relationship information is organized in a graph data structure, named the integrated bio-entity network (IBN), where the vertices are the bio-entities and the edges represent their relationships. Under this framework, graph-theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses, that is, the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs.
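One common way to compute a most probable path is sketched below, as an illustration rather than the paper's MPP algorithm, whose details the abstract does not give. It assumes each edge carries an independent probability in (0, 1], so that maximizing the product of probabilities is equivalent to a shortest-path search over costs of -log(p). The node names and the function name are hypothetical.

```python
import heapq
import math

def most_probable_path(graph, source, target):
    """graph: {node: {neighbor: probability in (0, 1]}}.
    Returns (path probability, path), or (0.0, []) if target is unreachable."""
    best = {source: 0.0}  # lowest known cost, where cost = -log(probability)
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nbr, p in graph.get(node, {}).items():
            new_cost = cost - math.log(p)
            if new_cost < best.get(nbr, math.inf):
                best[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if target not in best:
        return 0.0, []
    # Walk the predecessor links back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return math.exp(-best[target]), path

# Toy network (hypothetical entities and edge probabilities).
graph = {
    "protein_P": {"molecule_M": 0.9, "pathway_W": 0.5},
    "molecule_M": {"disease_D": 0.8},
    "pathway_W": {"disease_D": 0.9},
}
prob, path = most_probable_path(graph, "protein_P", "disease_D")
```

In the toy network the route through `molecule_M` has probability 0.9 × 0.8 = 0.72, beating the route through `pathway_W` at 0.45, so the indirect protein-disease relationship is reported via `molecule_M`.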

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The Current Development of Open Election Data in Taiwan

Author: Chien-Chang Chen

[[notice]] Correction completed

Tamkang University Institutional Repository

Elucidation of functional consequences of signalling pathway interactions

Author: Ihekwaba Adaoha EC
Nguyen Phuong T
Priami Corrado
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract. Background: A great deal of data has accumulated on signalling pathways. These large datasets are thought to contain much implicit information on molecular structure, interaction and activity, which provides a picture of the intricate molecular networks believed to underlie biological functions. While tremendous advances have been made in trying to understand these systems, how information is transmitted within them is still poorly understood. This ever-growing amount of data demands that we adopt powerful computational techniques that will play a pivotal role in converting mined data into knowledge and in elucidating the topological and functional properties of protein-protein interactions. Results: A computational framework is presented that allows embedded networks to be described and common shared components, thought to assist in the transmission of information within the systems studied, to be identified. By employing graph-theoretic measures from network biology, such as degree distribution, clustering coefficient, vertex betweenness and shortest-path measures, topological features of protein-protein interactions were ascertained for published datasets of the p53, nuclear factor kappa B (NF-κB) and G1/S phase of the cell cycle systems. Highly ranked nodes were determined, which in some cases were identified as connecting proteins most likely responsible for propagating transduction signals across the networks. The functional consequences of these nodes in the context of their network environment were also determined. These findings highlight the usefulness of the framework in identifying possible combinations or links as targets for therapeutic responses, and put forward the idea of using retrieved knowledge on the shared components to construct better organised and structured models of signalling networks.
Conclusion: It is hoped that, through the data-mined reconstructed signal transduction networks, well-developed models of the published data can be built, which would ultimately guide the prediction of new targets based on the pathway's environment for further analysis. Source code is available upon request.
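Two of the topological measures named in the results, node degree and the local clustering coefficient, can be computed as in the sketch below. It assumes an undirected interaction network stored as an adjacency map of sets; the protein names `p1` to `p4` are hypothetical, and this is not the authors' code, which the abstract says is available on request.

```python
from itertools import combinations

def degree(adj):
    """Number of interaction partners for every node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}

def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbours; reported as 0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

# Toy undirected protein-protein interaction network (hypothetical).
adj = {
    "p1": {"p2", "p3", "p4"},
    "p2": {"p1", "p3"},
    "p3": {"p1", "p2"},
    "p4": {"p1"},
}
degrees = degree(adj)
cc_p1 = clustering_coefficient(adj, "p1")
```

In this toy network `p1` has the highest degree, but only one of its three neighbour pairs is connected, giving a clustering coefficient of 1/3. High degree combined with low clustering is one simple signature of a node that connects otherwise separate parts of a network.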

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio della Ricerca - Università di Pisa

Open Repository and Bibliography - Luxembourg