
A context-blocks model for identifying clinical relationships in patient records

Author: A Névéol
A Roberts
AK McCallum
AM Cohen
AR Aronson
Aurélie Névéol
C Friedman
ES Chen
F Leitner
H Shatkay
H Xu
J Aberdeen
J Björne
J Lafferty
L Smith
L Tanabe
M Bundschus
M Craven
M Krallinger
N Ponomareva
O Uzuner
O Uzuner
R Harpaz
R Islamaj Doğan
R Islamaj Doğan
Rezarta Islamaj Doğan
SM Meystre
SV Pakhomov
TC Rindflesch
TC Rindflesch
X Wang
X Wang
X Wang
Zhiyong Lu
Publication venue: BioMed Central
Publication date: 01/01/2011

GenCLiP: a software program for clustering gene lists by literature profiling and constructing gene co-occurrence networks related to custom keywords

Author: AA Schaffer
BT Alako
C Plake
C Rodriguez-Penagos
D Chaussabel
D Lee
EG Cerami
G Karakiulakis
H Kim
Hui-Yong Tian
Jin Zhao
K Fundel
Kai-Tai Yao
KJ Bussey
LJ Jensen
M Bundschus
M Suderman
MB Eisen
N Daraselia
P Shannon
R Hammamieh
R Hoffmann
R Rubinstein
RT Tsai
S Li
T Ide
TK Jenssen
VK Gajendran
Yi-Bo Zhou
Z Huang
ZF Hu
Zhen-Fu Hu
Zhong-Xi Huang
Publication venue: BioMed Central
Publication date: 01/07/2008

Abstract. Background: Biomedical researchers often want to explore the pathogenesis and pathways regulated by abnormally expressed genes, such as those identified by microarray analyses. Literature mining is an important way to assist in this task, and many literature mining tools are now available. However, few of them allow the user to make manual adjustments to zero in on what he or she particularly wants to know. Results: We present our software program, GenCLiP (Gene Cluster with Literature Profiles), which is based on the methods presented by Chaussabel and Sher (Genome Biol 2002, 3(10):RESEARCH0055) that search gene lists to identify functional clusters of genes based on up-to-date literature profiling. Four features were added to this previously described method: the ability to 1) manually curate keywords extracted from the literature, 2) search genes and gene co-occurrence networks related to custom keywords, 3) compare analyzed gene results with negative and positive controls generated by GenCLiP, and 4) calculate probabilities that the resulting genes and gene networks are randomly related. In this paper, we show, with a set of genes differentially expressed between keloids and normal controls, how the functions implemented in GenCLiP successfully identified keywords related to the pathogenesis of keloids and previously unknown gene pathways involved in it. Conclusion: With regard to the identification of disease-susceptibility genes, GenCLiP allows one to quickly acquire a primary pathogenesis profile and identify pathways involving abnormally expressed genes not previously associated with the disease.
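The idea of a gene co-occurrence network restricted to a custom keyword can be illustrated with a toy sketch. This is not GenCLiP's implementation: the function name `cooccurrence_network`, the naive whitespace tokenisation, and the four example sentences are all assumptions made up for illustration. Two genes are linked each time they are mentioned in the same document that also contains the keyword.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(docs, genes, keyword=None):
    """Count how often each pair of genes appears in the same document.

    Hypothetical helper: if `keyword` is given, only documents whose text
    contains it (case-insensitively) contribute to the counts.
    """
    edges = Counter()
    for text in docs:
        lowered = text.lower()
        if keyword is not None and keyword.lower() not in lowered:
            continue
        # Naive tokenisation: strip commas and full stops, split on whitespace.
        tokens = set(lowered.replace(",", " ").replace(".", " ").split())
        present = sorted(g for g in genes if g.lower() in tokens)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Made-up example sentences, not real abstracts.
docs = [
    "TGFB1 and SMAD3 drive fibrosis in keloid tissue.",
    "SMAD3 signalling interacts with TGFB1 in keloid fibroblasts.",
    "TP53 mutations are unrelated here.",
    "TGFB1 and TP53 in wound healing.",
]
genes = ["TGFB1", "SMAD3", "TP53"]

keloid_network = cooccurrence_network(docs, genes, keyword="keloid")
full_network = cooccurrence_network(docs, genes)
```

Here `keloid_network` keeps only the edge supported by keyword-matching documents, while `full_network` also includes pairs from the other sentences. A real system would additionally need gene-name normalisation and a significance estimate for each edge, which this sketch omits.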

Directory of Open Access Journals

Large-scale directional relationship extraction and resolution

Author: A Culotta
A Gladki
A Koike
A Yuryev
AB Clegg
C Rodriguez-Penagos
CM Topinka
Cory B Giles
D Zhou
F Rinaldi
F Rinaldi
H Chen
H Jang
H Kim
I Donaldson
IK Ruf
J Ding
J Jiang
JA Mitchell
JC Park
JD Kim
JD Kim
JD Wren
JD Wren
JD Wren
Jonathan D Wren
JP Vaque
K Fundel
K Sagae
LM Juliano
M Bundschus
M Chagoyen
M Huang
M Lease
M Wang
M-C de Marneffe
N Daraselia
P Zweigenbaum
R Bunescu
R Kuffner
RC Bunescu
RT Tsai
S Kim
S Novichkova
TK Jenssen
W Pratt
Publication venue: BioMed Central
Publication date: 01/01/2008

HypertenGene: extracting key hypertension genes from biomedical literature with position and automatically-generated template features

Author: A Rzhetsky
AK Ramani
B Rosario
C Blaschke
Chi-Hsin Huang
F Sha
H-W Chun
Hong-Jie Dai
HW Chun
J Lafferty
J Nocedal
J Xiao
JN Darroch
K Becker
K Hirohata
M Bundschus
M Craven
M Masseroli
M Shimbo
N Kambhatla
P Ruch
Po-Ting Lai
R Bunescu
R Weissberg
RC Bunescu
Richard Tzong-Han Tsai
RT Tsai
RTK Lin
T Ono
T Rindflesch
TC Rindflesch
TF Smith
TH Tsai
Wen-Harn Pan
Wen-Lian Hsu
Y Yamamoto
Yen-Ching Chang
Yue-Yang Bow
Z GuoDong
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract. Background: The genetic factors leading to hypertension have been extensively studied, and large numbers of research papers have been published on the subject. One of hypertension researchers' primary research tasks is to locate key hypertension-related genes in abstracts. However, gathering such information with existing tools is not easy: (1) Searching for articles often returns far too many hits to browse through. (2) The search results do not highlight the hypertension-related genes discovered in the abstract. (3) Even though some text mining services mark up gene names in the abstract, the key genes investigated in a paper are still not distinguished from other genes. To facilitate the information gathering process for hypertension researchers, one solution would be to extract the key hypertension-related genes in each abstract. Three major tasks are involved in the construction of this system: (1) gene and hypertension named entity recognition, (2) section categorization, and (3) gene-hypertension relation extraction. Results: We first compare the retrieval performance achieved by individually adding template features and position features to the baseline system. Then, the combination of both is examined. We found that using position features can almost double the original AUC score of the baseline system (0.8140 vs. 0.4936). However, adding template features alone yields only a marginal improvement (0.0197). Including both improves AUC to 0.8184, indicating that these two sets of features are complementary and do not have overlapping effects. We then examine the performance in a different domain, diabetes, and the result shows a satisfactory AUC of 0.83. Conclusion: Our approach successfully exploits template features to recognize true hypertension-related gene mentions and position features to distinguish key genes from other related genes. Templates are automatically generated and checked by biologists to minimize labor costs.
Our approach integrates the advantages of machine learning models and pattern matching. To the best of our knowledge, this is the first systematic study of extracting hypertension-related genes and the first attempt to create a hypertension-gene relation corpus based on the GAD database. Furthermore, our paper proposes and tests novel features for extracting key hypertension genes, such as relative position, section, and template features, which could also be applied to key-gene extraction for other diseases.
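The AUC figures quoted above measure how well a ranking separates positive from negative examples. As a minimal sketch, assuming binary labels and one score per candidate, AUC can be computed as the fraction of positive/negative pairs in which the positive is ranked higher, with ties counting as half. The function name `auc` and the toy scores are hypothetical and are not taken from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equivalent to the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 marks a key gene mention, 0 marks any other gene mention.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_auc = auc(toy_scores, toy_labels)  # 3 of 4 pairs ranked correctly
```

A value of 0.5 corresponds to a random ranking and 1.0 to a perfect one. The quadratic pairwise loop is chosen for clarity; a rank-based computation would be preferable for large candidate sets.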

Directory of Open Access Journals

Public Library of Science (PLOS)

Gene-Disease Network Analysis Reveals Functional Modules in Mendelian, Complex and Environmental Diseases

Author: A Bauer-Mehren
A Bauer-Mehren
A Bauer-Mehren
A Fernández
A Hamosh
A López García De Lomana
A-L Barabasi
A-L Barabási
AD D'Andrea
AH Smith
AJ Enright
Anna Bauer-Mehren
C Gabor
C Margadant
C The UniProt
C-Y Yang
CJ Mattingly
CR Scriver
CT Butts
D Botstein
D-T Bau
DHEW Huberts
E Cerami
E Ravasz
F Scaldaferri
Ferran Sanz
G Lima-Mendez
HY Chiou
I Celik
J Freudenberg
J Lim
JA Kennedy
JA Mitchell
JN Hirschhorn
Jv Reeuwijk
K-I Goh
KM Dipple
L De Luca
LA Garraway
Laura I. Furlong
LH Hartwell
M Argos
M Bundschus
M Cokol
M Melkoniemi
M Oti
MA van Driel
MA Yildirim
Markus Bundschus
MEJ Newman
MG Kann
Michael Rautschka
Miguel A. Mayer
MJ Thun
MP Snead
N Przulj
NA Zaghloul
NN Ahmad
R Rubinstein
R Sankaranarayanan
R Sharan
Raya Khanin
RB Altman
RH Duerr
S Ananiadou
S Carreira
S Chavali
S Jones
S Park
S Suthram
S van Dongen
SA Navarro Silvera
SI Berger
T Tsuda
TE Klein
V Radosavljević
Y Li
Z Lu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Scientists have long been trying to understand the molecular mechanisms of diseases in order to design preventive and therapeutic strategies. For some diseases, it has become evident that it is not enough to obtain a catalogue of the disease-related genes; one must also uncover how disruptions of molecular networks in the cell give rise to disease phenotypes. Moreover, with the unprecedented wealth of information available, even obtaining such a catalogue is extremely difficult. We developed a comprehensive gene-disease association database by integrating associations from several sources that cover different biomedical aspects of diseases. In particular, we focus on the current knowledge of human genetic diseases including mendelian, complex and environmental diseases. To assess the concept of modularity of human diseases, we performed a systematic study of the emergent properties of human gene-disease networks by means of network topology and functional annotation analysis. The results indicate a highly shared genetic origin of human diseases and show that for most diseases, including mendelian, complex and environmental diseases, functional modules exist. Moreover, a core set of biological pathways is found to be associated with most human diseases. We obtained similar results when studying clusters of diseases, suggesting that related diseases might arise due to dysfunction of common biological processes in the cell. For the first time, we include mendelian, complex and environmental diseases in an integrated gene-disease association database and show that the concept of modularity applies to all of them. We furthermore provide a functional analysis of disease-related modules, yielding important new biological insights that might not be discovered when considering each of the gene-disease association repositories independently. Hence, we present a suitable framework for the study of how genetic and environmental factors, such as drugs, contribute to diseases.
The gene-disease networks used in this study and part of the analysis are available at http://ibi.imim.es/DisGeNET/DisGeNETweb.html#Download

Directory of Open Access Journals

Open Access LMU

UPF Digital Repository

Clustering cliques for graph-based summarization of the biomedical research literature

Author: A Naud
A Nenkova
A Ozgür
A Pons-Porrata
AR Aronson
AT McCray
AT McCray
Bartlomiej Wilkowski
C Wartena
Dongwook Shin
F Lerch
G Erkan
G Liu
GC Stein
H Kilicoglu
H Kilicoglu
H Yu
H Zhang
Han Zhang
I Mani
I Yoo
J Ah-Pine
J Goodwin
J Yang
JB Kruskal
K Sparck Jones
KW Boyack
L Smith
LH Reeve
LH Reeve
M Bundschus
M Fiszman
M Fiszman
M Kan
M Lee
Marcelo Fiszman
MG Everett
MJ Norusis
O Bodenreider
P Langfelder
P Tan
PJ Rousseeuw
R Mihalcea
SP Borgatti
T Matsunage
TC Rindflesch
TC Rindflesch
Thomas C Rindflesch
V an der Spek P Klusener S
V Batagelj
VD Blondel
X Liu
X Zhang
Y Yamamoto
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

BACKGROUND: Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts). RESULTS: SemRep is used to extract semantic predications from the citations returned by a PubMed search. Cliques were identified from frequently occurring predications with highly connected arguments filtered by degree centrality. Themes contained in the summary were identified with a hierarchical clustering algorithm based on common arguments shared among cliques. The validity of the clusters in the summaries produced was compared to the Silhouette-generated baseline for cohesion, separation and overall validity. The theme labels were also compared to a reference standard produced with major MeSH headings. CONCLUSIONS: For 11 topics in the testing data set, the overall validity of clusters from the system summary was 10 percentage points better than the baseline (43% versus 33%). When compared to the reference standard from MeSH headings, the results for recall, precision and F-score were 0.64, 0.65, and 0.65, respectively.

Online Research Database In Technology

Fusing literature and full network data improves disease similarity computation

Author: A Bauer-Mehren
A Bravo
A Chatr-Aryamontri
A Gottlieb
A Schlicker
AM Emokpae
AP Davis
B Demchak
C Ortutay
D Lin
DA Lindberg
DJ Clauw
DS Lee
F Kayhan
F Pedregosa
G Hu
HJ Lowe
I Lee
J Pinero
J Wang
JA Mitchell
Jingkai Yu
JM Stuart
JS Amberger
JZ Wang
KG Becker
KI Goh
L Cheng
L Cheng
M Ashburner
M Bettini
M Bundschus
M Jallouli
M Persico
M Raica
MH Coletti
P Li
P Resnik
Ping Li
PJ Heagerty
PN Robinson
R Hoehndorf
S Kohler
S Kohler
S Kohler
S Mathur
S Mathur
S Navlakha
S Orchard
S Pakhomov
S Rose
S Suthram
T Fawcett
T Groza
T Zemojtel
TS Keshava Prasad
WA Kibbe
X Zhang
Y Deng
Yaling Nie
Publication venue: Springer Science and Business Media LLC

Knowledge-based extraction of adverse drug events from biomedical text

Author: A Airola
A Ozg
AM Cohen
AR Aronson
Bharat Singh
C Bizer
CF Thorn
Chinh Bui
D Demner-Fushman
D Ferrucci
D Hanisch
D Revere
DS Wishart
E Buyko
Erik M van Mulligen
F Leitner
F Rinaldi
F Rinaldi
GB Melton
H Gurulingappa
H Gurulingappa
H Gurulingappa
H Jang
HJ Dai
HW Chun
J Saric
J-D Kim
J-H Kim
Jan A Kors
JD Kim
K Fundel
KM Hettne
LJ Jensen
M Bundschus
M Huang
M Krallinger
M Krallinger
M Krallinger
MA Schwartz Hearst
MJ Schuemie
MS Simpson
N Kang
N Kang
N Kang
Ning Kang
O Bodenreider
O Bodenreider
O Uzuner
P Zweigenbaum
PL Elkin
QC Bui
R Islamaj Doğan
S Buchholz
S Kandula
S Katrenko
S Pyysalo
T Rindflesch
TC Rindflesch
TC Rindflesch
U Hahn
Y Huang
Y Kano
Y Tateisi
Zubair Afzal
Publication venue: Springer Science and Business Media LLC

A Hybrid approach for biomedical relation extraction using finite state automata and random forest-weighted fusion

Author: A Ben Abacha
A Ben Abacha
B Rink
C Friedman
CC Chang
D Liparas
D Steyrl
E Gokgoz
EE Tripoliti
J Li
L Breiman
M Bundschus
O Frunza
R Feldman
Ö Uzuner
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2017

Paper presented at: The 18th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2017), held in Budapest, Hungary, 17 to 23 April 2017.
The automatic extraction of relations between medical entities found in related texts is considered to be a very important task, due to the multitude of applications that it can support, from question answering systems to the development of medical ontologies. Many different methodologies have been presented and applied to this task over the years. Of particular interest are hybrid approaches, in which different techniques are combined in order to improve on the individual performance of each of them. In this study, we extend a previously established hybrid framework for medical relation extraction, which we modify by enhancing the pattern-based part of the framework and by applying a more sophisticated weighting method. Most notably, we replace the use of regular expressions with finite state automata for the pattern-building part, while the fusion part is replaced by a weighting strategy that is based on the operational capabilities of the Random Forests algorithm. The experimental results indicate the superiority of the proposed approach over the aforementioned well-established hybrid methodology and other state-of-the-art approaches.
This work was supported by the project KRISTINA (H2020-645012), funded by the European Commission. Deidentified clinical records used in this research were provided by the i2b2 National Center for Biomedical Computing funded by U54LM008748 and were originally prepared for the Shared Tasks for Challenges in NLP for Clinical Data organized by Dr. Ozlem Uzuner, i2b2 and SUNY