1,131 research outputs found

Knowledge-based Biomedical Data Science 2019

Author: Callahan Tiffany J.
Hunter Lawrence E.
Pielke-Lombardo Harrison
Tripodi Ignacio J.
Publication date: 08/10/2019

Knowledge-based biomedical data science (KBDS) involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey the progress in the last year in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing, and the expansion of knowledge-based approaches to novel domains, such as Chinese Traditional Medicine and biodiversity.
Comment: Manuscript 43 pages with 3 tables; supplemental material 43 pages with 3 tables.

arXiv.org e-Print Archive

Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review

Author: Dudley Joel T
Lavelli Alberto
Miotto Riccardo
Osmani Venet
Rinaldi Fabio
Sheikhalishahi Seyedmostafa
Publication date: 01/04/2019

Novel approaches that complement and go beyond evidence-based medicine are required in the domain of chronic diseases, given the growing incidence of such conditions in the worldwide population. A promising avenue is the secondary use of electronic health records (EHRs), where patient data are analyzed to conduct clinical and translational research. Methods based on machine learning to process EHRs are resulting in improved understanding of patient clinical trajectories and chronic disease risk prediction, creating a unique opportunity to derive previously unknown clinical insights. However, a wealth of clinical histories remains locked behind clinical narratives in free-form text. Consequently, unlocking the full potential of EHR data is contingent on the development of natural language processing (NLP) methods to automatically transform clinical text into structured clinical data that can guide clinical decisions and potentially delay or prevent disease onset.

arXiv.org e-Print Archive

Archivio della ricerca - Fondazione Bruno Kessler

ZORA

Integrating Medical Ontology and Pseudo Relevance Feedback For Medical Document Retrieval

Author: Ghoddousi Andia
Publication date: 20/09/2016

The purpose of this thesis is to improve the accuracy of locating relevant documents in a large amount of Electronic Medical Data (EMD). The goal of this research is to propose a new way of using medical ontology that gives patients an easier and more reliable means of understanding their diseases and helps doctors find and further improve possible methods of diagnosis and treatment. The empirical studies were based on the health care dataset provided by CLEF. In this research, I used Information Retrieval to obtain relevant information from the large data sets provided by CLEF. I then used the ranking functionality of the Terrier platform to score and evaluate the matching documents in the collection. BM25 was used as the base normalization method to retrieve the results, and a Pseudo Relevance Feedback weighting model was used to retrieve information on patients' health history and medical records in order to find more accurate results. I then used the Unified Medical Language System (UMLS) to index the queries while searching the Internet for health-related documents. The UMLS software was used to link the computer system with health and biomedical terms and vocabularies for classification; it works as a dictionary for patients by translating medical terms. In future work, I would like to use medical ontology to relate the medical documents to my retrieved results.
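
The retrieval pipeline described in this abstract (BM25 ranking followed by pseudo relevance feedback) can be illustrated with a minimal Python sketch. This is not the Terrier implementation used in the thesis: the function names, the toy documents, the parameter defaults (k1 = 1.2, b = 0.75) and the simple "most frequent terms" expansion rule are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def expand_query(query, docs, top_docs=1, extra_terms=2):
    """Pseudo relevance feedback (simplified): treat the top-ranked
    documents as relevant and add their most frequent new terms."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    pool = Counter()
    for i in ranked[:top_docs]:
        pool.update(t for t in docs[i] if t not in query)
    return list(query) + [t for t, _ in pool.most_common(extra_terms)]

# Hypothetical toy collection, not CLEF data.
docs = [
    "diabetes insulin glucose insulin therapy".split(),
    "asthma inhaler airway".split(),
    "glucose monitoring diabetes".split(),
]
query = ["diabetes"]
scores = bm25_scores(query, docs)
expanded = expand_query(query, docs)
```

In this sketch the expanded query would be run through `bm25_scores` a second time; production systems typically weight expansion terms with a dedicated model rather than raw frequency.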

YorkSpace

Literature Mining for the Discovery of Hidden Connections between Drugs, Genes and Diseases

Author: Jacob de Vlieg
Marianne van Vugt
Raoul Frijters
René van Schaik
Ruben Smeets
Wynand Alkema
Publication venue: Public Library of Science
Publication date: 01/01/2010

The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.
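
The ABC-principle mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration of the general idea (A and C never co-occur, but both co-occur with some B), not CoPub Discovery's actual scoring; the function names and the toy concept sets are assumptions, loosely modeled on the classic fish oil and Raynaud disease example.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_pairs(documents):
    """Return every unordered pair of concepts mentioned in the same document."""
    pairs = set()
    for concepts in documents:
        for a, b in combinations(sorted(set(concepts)), 2):
            pairs.add((a, b))
    return pairs

def hidden_links(documents, start):
    """Find C concepts that never co-occur with `start` (A) but share
    at least one intermediate B concept with it."""
    neighbours = defaultdict(set)
    for a, b in cooccurrence_pairs(documents):
        neighbours[a].add(b)
        neighbours[b].add(a)
    direct = neighbours[start]  # the B-intermediates
    candidates = defaultdict(set)
    for b in direct:
        for c in neighbours[b]:
            if c != start and c not in direct:
                candidates[c].add(b)
    return dict(candidates)

# Hypothetical toy input: each entry is the concept set of one abstract.
documents = [
    ["fish oil", "blood viscosity"],
    ["blood viscosity", "Raynaud disease"],
    ["fish oil", "platelet aggregation"],
    ["platelet aggregation", "Raynaud disease"],
]
links = hidden_links(documents, "fish oil")
```

Here `links` maps each candidate C concept to the set of B-intermediates that connect it to A. A real system would additionally rank candidates by a statistical co-occurrence score rather than treating every shared intermediate equally.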

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Radboud Repository

CoPub Mapper: mining MEDLINE based on search term co-publication

Author: Alako Blaise TF
Jelier Rob
Jenster Guido
Polman Jan
Rullmann Ton
van Baal Sjozef
Veldhoven Antoine
Verhoeven Stefan
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: High throughput microarray analyses result in many differentially expressed genes that are potentially responsible for the biological process of interest. In order to identify biological similarities between genes, publications from MEDLINE were identified in which pairs of gene names and combinations of gene name with specific keywords were co-mentioned. RESULTS: MEDLINE search strings for 15,621 known genes and 3,731 keywords were generated and validated. PubMed IDs were retrieved from MEDLINE and the relative probability of co-occurrence of all gene-gene and gene-keyword pairs was determined. To assess gene clustering according to literature co-publication, 150 genes consisting of 8 sets with known connections (same pathway, same protein complex, or same cellular localization, etc.) were run through the program. Receiver operating characteristic (ROC) analyses showed that most gene sets were clustered much better than expected by random chance. To test grouping of genes from real microarray data, 221 differentially expressed genes from a microarray experiment were analyzed with CoPub Mapper, which resulted in several relevant clusters of genes with biological process and disease keywords. In addition, all genes versus keywords were hierarchically clustered to reveal a complete grouping of published genes based on co-occurrence. CONCLUSION: The CoPub Mapper program allows for quick and versatile querying of co-published genes and keywords and can be successfully used to cluster predefined groups of genes and microarray data.

Lirias

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

EUR Research Repository

Wageningen University & Research Publications

Erasmus University Digital Repository

Retrieval of Sibling Studies for Clinical Randomised Control Trials

Author: Hamad Faten Fatehi
Publication date: 29/04/2013

Aberystwyth Research Portal

Linking Clinical Records to the Biomedical Literature

Author: Alnazzawi Noha
Publication date: 31/12/2016

The University of Manchester - Institutional Repository

Nutritional Systems Biology

Author: Jensen Kasper
Publication venue: Technical University of Denmark
Publication date: 01/01/2014

Online Research Database In Technology

Query-Constraint-Based Mining of Association Rules for Exploratory Analysis of Clinical Datasets in the National Sleep Research Resource

Author: Abeysinghe Rashmie
Cui Licong
Publication venue: UKnowledge
Publication date: 23/07/2018

Background: Association Rule Mining (ARM) has been widely used by biomedical researchers to perform exploratory data analysis and uncover potential relationships among variables in biomedical datasets. However, when biomedical datasets are high-dimensional, performing ARM on such datasets will yield a large number of rules, many of which may be uninteresting. Especially for imbalanced datasets, performing ARM directly would result in uninteresting rules that are dominated by certain variables that capture general characteristics. Methods: We introduce a query-constraint-based ARM (QARM) approach for exploratory analysis of multiple, diverse clinical datasets in the National Sleep Research Resource (NSRR). QARM enables rule mining on a subset of data items satisfying a query constraint. We first perform a series of data-preprocessing steps including variable selection, merging semantically similar variables, combining multiple-visit data, and data transformation. We use the Top-k Non-Redundant (TNR) ARM algorithm to generate association rules. Then we remove general and subsumed rules so that only unique and non-redundant rules remain for a particular query constraint. Results: Applying QARM to five datasets from NSRR yielded a total of 2517 association rules with a minimum confidence of 60% (using the top 100 rules for each query constraint). The results show that merging similar variables could avoid uninteresting rules. Also, removing general and subsumed rules resulted in a more concise and interesting set of rules. Conclusions: QARM shows the potential to support exploratory analysis of large biomedical datasets. It is also shown to be a useful method to reduce the number of uninteresting association rules generated from imbalanced datasets.
A preliminary literature-based analysis showed that some association rules have supporting evidence from biomedical literature, while others without literature-based evidence may serve as candidates for new hypotheses to explore and investigate. Together with literature-based evidence, the association rules mined over the NSRR clinical datasets may be used to support clinical decisions for sleep-related problems.
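
The core idea of mining rules only over records that satisfy a query constraint, and keeping rules above a confidence threshold, can be sketched as follows. This is a brute-force illustration restricted to single-item rules, not the TNR algorithm or the QARM pipeline from the paper; the function name, the item labels and the toy records are all hypothetical.

```python
from itertools import permutations

def mine_rules(records, constraint, min_confidence=0.6):
    """Mine single-item rules X -> Y over records containing every
    item in `constraint`. Returns (x, y, support, confidence) tuples."""
    # Keep only the records that satisfy the query constraint.
    subset = [r for r in records if constraint <= r]
    if not subset:
        return []
    # Items fixed by the constraint are excluded from the rules.
    items = sorted(set().union(*subset) - constraint)
    rules = []
    for x, y in permutations(items, 2):
        n_x = sum(1 for r in subset if x in r)
        n_xy = sum(1 for r in subset if x in r and y in r)
        if n_x and n_xy / n_x >= min_confidence:
            support = n_xy / len(subset)
            confidence = n_xy / n_x
            rules.append((x, y, support, confidence))
    return rules

# Hypothetical records; each is the set of items observed for one subject.
records = [
    {"age>=65", "snoring", "hypertension"},
    {"age>=65", "snoring", "hypertension"},
    {"age>=65", "snoring"},
    {"age<65", "snoring"},
    {"age>=65", "hypertension"},
]
rules = mine_rules(records, constraint={"age>=65"})
```

With this toy input, four records satisfy the constraint, and both `hypertension -> snoring` and `snoring -> hypertension` pass the 60% threshold with confidence 2/3 and support 0.5. Filtering first and mining second is the design choice being illustrated: rules are computed relative to the constrained subpopulation rather than the whole dataset.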

University of Kentucky

Citationally Enhanced Semantic Literature Based Discovery

Author: Fleig John David
Publication venue: NSUWorks
Publication date: 01/01/2019

We are living in the age of information. The ever-increasing flow of data and publications poses a monumental bottleneck to scientific progress because, despite its amazing abilities, the human mind is woefully inadequate at processing such a vast quantity of multidimensional information. The small bits of flotsam and jetsam that we leverage belie the amount of useful information beneath the surface. It is imperative that automated tools exist to better search, retrieve, and summarize this content. Combinations of document indexing and search engines can quickly find you a document whose content best matches your query - if the information is all contained within a single document. But they do not draw connections, make hypotheses, or find knowledge hidden across multiple documents. Literature-based discovery is an approach that can uncover hidden interrelationships between topics by extracting information from existing published scientific literature. The proposed study utilizes a semantic-based approach that builds a graph of related concepts between two user-specified sets of topics using semantic predications. In addition, the study includes properties of bibliographically related documents and statistical properties of concepts to further enhance the quality of the proposed intermediate terms. Our results show an improvement in precision-recall when incorporating citations.

NSU Works