
Leiden University Scholarly Publications

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Novel Protein-Protein Interactions Inferred from Literature Context

We have developed a method that predicts protein-protein interactions (PPIs) based on the similarity of the contexts in which proteins appear in the literature. This method outperforms previously developed PPI prediction algorithms that rely on the co-occurrence of two protein names in MEDLINE abstracts. We show significant increases in coverage (76% versus 32%) and sensitivity (66% versus 41% at a specificity of 95%) for the prediction of PPIs currently archived in six PPI databases. A retrospective analysis shows that PPIs can be predicted efficiently before they enter PPI databases and before their interaction is explicitly described in the literature. The practical value of the method for the discovery of novel PPIs is illustrated by the experimental confirmation of the inferred physical interaction between CAPN3 and PARVB, which was based on the frequent co-occurrence of both proteins with concepts such as Z-disc, dysferlin, and alpha-actinin. The relationships between proteins predicted by our method are broader than PPIs and include proteins in the same complex or pathway. Depending on the type of relationship deemed useful, the precision of our method can be as high as 90%. The full set of predicted interactions is available as a downloadable matrix and through the web tool Nermal, which lists the most likely interaction partners for a given protein. Our framework can be used to prioritize potential, hitherto undiscovered interaction partners for follow-up studies and to aid the generation of accurate protein interaction maps.
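The idea of comparing the literature contexts of two proteins can be illustrated with a small sketch. The abstract does not state which similarity measure the authors used, so the cosine similarity below is an assumption, and the profiles, weights, and function names are hypothetical toy values chosen only for illustration.

```python
from math import sqrt


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two sparse concept profiles.

    Each profile is a dict mapping a concept name to a weight.
    Cosine is an assumed measure, not necessarily the one used in the paper.
    """
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[c] * profile_b[c] for c in shared)
    norm_a = sqrt(sum(w * w for w in profile_a.values()))
    norm_b = sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_partners(query_profile, candidate_profiles):
    """Return candidate names sorted by decreasing similarity to the query."""
    scored = [
        (name, cosine_similarity(query_profile, profile))
        for name, profile in candidate_profiles.items()
    ]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical toy profiles; the weights are invented for this example.
capn3 = {"Z-disc": 0.9, "dysferlin": 0.7, "alpha-actinin": 0.6}
candidates = {
    "PARVB": {"Z-disc": 0.8, "dysferlin": 0.5, "alpha-actinin": 0.7},
    "UNRELATED": {"ribosome": 0.9, "translation": 0.8},
}

ranking = rank_partners(capn3, candidates)
```

In this toy setup the protein that shares context concepts with the query ranks first, while a protein with no shared concepts scores zero.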

Erasmus University Digital Repository

EUR Research Repository

Text-derived concept profiles support assessment of DNA microarray data for acute myeloid leukemia and for androgen receptor stimulation

BACKGROUND: High-throughput experiments, such as those with DNA microarrays, typically yield hundreds of genes potentially relevant to the process under study, which makes these experiments difficult to interpret. Here, we propose and evaluate an approach to find functional associations between large numbers of genes and other biomedical concepts from free-text literature. For each gene, a profile of related concepts is constructed that summarizes the context in which the gene is mentioned in the literature. We assign a weight to each concept in the profile based on a likelihood ratio measure. Gene concept profiles can then be clustered to find related genes and other concepts. RESULTS: The experimental validation was done in two steps. We first applied our method to a controlled test set. After this proved successful, the datasets from two DNA microarray experiments were analyzed in the same way and the results were evaluated by domain experts. The first dataset was a gene-expression profile that characterizes the cancer cells of a group of acute myeloid leukemia patients. For this group of patients, the biological background of the cancer cells is largely unknown. Using our methodology, we found an association of these cells with monocytes, which agreed with other experimental evidence. The second dataset consisted of genes differentially expressed following androgen receptor stimulation in a prostate cancer cell line. Based on the analysis, we put forward a hypothesis about the biological processes induced in these cells: secretory lysosomes are involved in the production of prostatic fluid, and their development and/or secretion are androgen-regulated processes. CONCLUSION: Our method can be used to analyze DNA microarray datasets based on information explicitly and implicitly available in the literature. We provide a publicly available tool, dubbed Anni, for this purpose.
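A likelihood-ratio weight for a concept in a gene profile can be sketched as follows. The abstract only says "a likelihood ratio measure", so the choice of the log-likelihood ratio statistic (G²) over a 2x2 document-count table is an assumption, and the function names and counts are hypothetical.

```python
from math import log


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table of document counts.

    k11: documents mentioning both gene and concept
    k12: documents mentioning the gene only
    k21: documents mentioning the concept only
    k22: documents mentioning neither
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += o * log(o / expected)
    return 2.0 * g2


def concept_weight(n_both, n_gene, n_concept, n_docs):
    """Weight of a concept in a gene profile, derived from document counts."""
    return log_likelihood_ratio(
        n_both,
        n_gene - n_both,
        n_concept - n_both,
        n_docs - n_gene - n_concept + n_both,
    )


# Hypothetical counts: 40 documents, gene and concept each in 20 of them.
independent = concept_weight(10, 20, 20, 40)  # co-mentions match chance
associated = concept_weight(18, 20, 20, 40)   # co-mentions well above chance
```

Note that G² grows with any departure from independence, so a practical weighting would also need to check that the gene and concept co-occur more often than expected rather than less.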

Maastricht University Research Portal

Springer - Publisher Connector

Erasmus University Digital Repository

EUR Research Repository

Text Mining for Literature Review and Knowledge Discovery in Cancer Risk Assessment and Research

Author: A Keselman
A Kolman
A Korhonen
Anna Korhonen
AR Feinstein
B Alex
C Boström
C Cortes
C Leslie
CC Chang
D Hattis
D McGregor
D Ó Séaghdha
Diarmuid Ó Séaghdha
DV Cicchetti
H Wang
Ilona Silins
J Cohen
J Lin
J Shawe-Taylor
Johan Högberg
K Bouker
K Morgan
KB Cohen
L Hunter
Lin Sun
M Hein
M Jackson
N Cristianini
N Karamanis
Neil R. Smalheiser
P Zweigenbaum
EFSA Panel on Plant Protection Products
R Frijters
R Jelier
R Judson
RB Altman
S Ananiadou
S Cohen
US National Academy of Science
T Byrt
T Joachims
TC Rindflesch
TG Dietterich
Ulla Stenius
Y Guo
YW Chen
Publication venue: Public Library of Science
Publication date: 12/04/2012

Research in biomedical text mining is starting to produce technology that can make information in the biomedical literature more accessible to bio-scientists. One of the current challenges is to integrate and refine this technology to support real-life scientific tasks in biomedicine, and to evaluate its usefulness in the context of such tasks. We describe CRAB, a fully integrated text mining tool designed to support chemical health risk assessment. This task is complex and time-consuming, requiring a thorough review of existing scientific data on a particular chemical. These data cover human, animal, cellular and other mechanistic evidence from various fields of biomedicine; they are highly varied and therefore difficult to harvest manually from literature databases. Our tool automates the process by extracting relevant scientific data from the published literature and classifying it according to multiple qualitative dimensions. Developed in close collaboration with risk assessors, the tool allows users to navigate the classified dataset in various ways and to share the data with other users. We present a direct and a user-based evaluation, which show that the technology integrated in the tool is highly accurate, and report a number of case studies that demonstrate how the tool can be used to support scientific discovery in cancer risk assessment and research. Our work demonstrates the usefulness of a text mining pipeline in facilitating complex research tasks in biomedicine. We discuss further development and application of our technology to other types of chemical risk assessment in the future.

FigShare

Integrated Genome-Scale Prediction of Detrimental Mutations in Transcription Networks

Author: A Tanay
AM Moses
AP Gasch
B Prud'homme
Ben Lehner
C Zhu
C-S Chin
CA Brown
CS Chan
CT Harbison
D Schmidt
DM Gelperin
DT Odom
E Segal
ET Dermitzakis
G Giaever
G Liti
GD Stormo
HC Mak
I Lee
I Tirosh
J Gagneur
J Gerke
J Gertz
J Ihmels
J Kim
J Zheng
J Zhu
JD Lieb
JH McDonald
JI Semple
Joshua M. Akey
K Chen
KD MacIsaac
L Giorgetti
L Peña-Castillo
L Teytelman
LA Boyer
LA Hindorff
M Dreze
M Kellis
MC King
Mirko Francesconi
NN Batada
Q Zhong
R Johnson
R Jothi
R Sopko
Rob Jelier
S MacArthur
S Marcand
S Ohno
S Zeiser
SB Carroll
SW Doniger
T Vavouri
U Nagalakshmi
V Mustonen
X yong Li
Y Bilu
Y Field
Z Ouyang
Z Wunderlich
Publication venue: Public Library of Science
Publication date: 01/05/2011

A central challenge in genetics is to understand when and why mutations alter the phenotype of an organism. The consequences of gene inhibition have been systematically studied and can be predicted reasonably well across a genome. However, many sequence variants important for disease and evolution may alter gene regulation rather than gene function. The consequences of altering a regulatory interaction (or “edge”) rather than a gene (or “node”) in a network have not been studied as extensively. Here we use an integrative analysis and evolutionary conservation to identify features that predict when the loss of a regulatory interaction is detrimental in the extensively mapped transcription network of budding yeast. Properties such as the strength of an interaction, its location and context in a promoter, the importance of the regulator and target gene, and the potential for compensation (redundancy) are each associated to some extent with interaction importance. Combined, however, these features predict quite well whether the loss of a regulatory interaction is detrimental across many promoters and for many different transcription factors. Thus, despite the potential for regulatory diversity, common principles can be used to understand and predict when changes in regulation are most harmful to an organism.

Lirias

Structuring and extracting knowledge for the support of hypothesis generation in molecular biology

Author: A Gomez-Perez
Andrew P Gibson
B Smith
C Goble
CA Goble
CD Manning
CJ Mungall
DA Moreira
DL Rubin
E Neumann
Edgar Meij
EJ Meij
G Antoniou
I Spasic
IH Witten
J Broekstra
JA Kors
Konstantinos Krommydas
LD Stein
LJ Post
M Ashburner
M Missikoff
M Scott Marshall
M Weeber
MA Inda
Marco Roos
Martijn Schuemie
O Tuason
P Fisher
P Missier
P Romano
Pieter W Adriaans
PJ Verschure
R Hoehndorf
R Jelier
R Stevens
R Witte
S Jupp
S Katrenko
Sophia Katrenko
T Clark
Willem Robert van Hage
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The need for automated support is exemplified by the difficulty of considering all relevant facts contained in the millions of documents available from PubMed. The Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from the literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit control over the modeling and extraction processes, we seek a methodology that gives the experimenter control over these critical processes. Results: We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence. Conclusion: We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation.

Springer - Publisher Connector

VU Research Portal

International Migration, Integration and Social Cohesion online publications

EUR Research Repository

UvA-DARE

Proteomic Analysis of the Dysferlin Protein Complex Unveils Its Importance for Sarcolemmal Maintenance and Integrity

Author: A Impagliazzo
A Lorusso
AK McNeil
André M. Deelder
Antoine de Morrée
B Borgonovo
BA Azakir
BN Ampong
C Cai
C Collinet
C Matsuda
C Therrien
D Bansal
DB Davis
DJ Hernandez-Deviez
E Cocucci
E Fujita
FC Luft
H Izzedine
Herman H. H. B. M. van Haagen
I Illa
I Roux
Irina Dragan
J Liu
JA Roche
JE Morgan
JF Covian-Nares
JS Beckmann
K Bushby
K Nagaraju
K Wenzel
KR Doherty
KS Rutgers
L Klinge
LO Martinez
LV Anderson
M Ho
Mel B. Feany
MH Disatnik
N De Luna
N Wein
NJ Lennon
NL Quach
NL Washington
P Krajacic
P Munoz
P Patel
Paul J. Hensbergen
Peter A. C. ’t Hoen
R Bashir
R Jelier
RL Mellgren
RM Foxton
Rune R. Frants
S Yamaji
S Yasunaga
Silvère M. van der Maarel
TM Olson
Y Huang
Y Ishihama
YH Chiu
YK Hayashi
ZA Pramono
Publication venue: Public Library of Science
Publication date: 01/01/2010

Dysferlin is critical for the repair of muscle membranes after damage. Mutations in dysferlin lead to a progressive muscular dystrophy. Recent studies suggest additional roles for dysferlin. We set out to study dysferlin's protein-protein interactions to obtain comprehensive knowledge of dysferlin functionalities in a myogenic context. We developed a robust and reproducible method to isolate dysferlin protein complexes from cells and tissue. We analyzed the composition of these complexes in cultured myoblasts, myotubes and skeletal muscle tissue by mass spectrometry and subsequently inferred potential protein functions through bioinformatics analyses. Our data confirm previously reported interactions and support a function for dysferlin as a vesicle trafficking protein. In addition, novel potential functionalities were uncovered, including phagocytosis and focal adhesion. Our data reveal that the composition of the dysferlin protein complex changes dynamically with myogenic differentiation. We provide additional experimental evidence and show that dysferlin localizes to, and interacts with, the focal adhesion protein vinculin at the sarcolemma. Finally, our studies reveal evidence for cross-talk between dysferlin and its protein family member myoferlin. Together, our analyses show that dysferlin is not only a membrane repair protein but is also important for muscle membrane maintenance and integrity.

CiteSeerX

Leiden University Scholarly Publications

Biomedical Text Mining and Its Applications

Calpain 3 Is a Rapid-Action, Unidirectional Proteolytic Switch Central to Muscle Remodeling

Calpain 3 (CAPN3) is a cysteine protease that, when mutated, causes Limb Girdle Muscular Dystrophy 2A. It is thereby the only described calpain family member known to cause a genetic disease. Due to its inherent instability, little is known of its substrates or its mechanism of activity and pathogenicity. In this investigation we define a primary sequence motif underlying CAPN3 substrate cleavage. This motif can transform unrelated proteins into substrates, and identifies >300 new putative CAPN3 targets. Bioinformatic analyses of these targets demonstrate a critical role in muscle cytoskeletal remodeling and identify novel CAPN3 functions. Among the new CAPN3 substrates are three E3 SUMO ligases of the Protein Inhibitor of Activated Stats (PIAS) family. CAPN3 can cleave PIAS proteins and negatively regulates PIAS3 sumoylase activity. Consequently, SUMO2 is deregulated in patient muscle tissue. Our study thus uncovers unexpected crosstalk between CAPN3 proteolysis and protein sumoylation, with strong implications for muscle remodeling.

CiteSeerX

Literature Mining for the Discovery of Hidden Connections between Drugs, Genes and Diseases

Author: AA Morgan
AC Nicholson
AJ Perez
Andrey Rzhetsky
AP Weetman
B Dell'Osso
B Rapoport
B Vaidya
BA Imhof
BT Alako
C Blaschke
C Nielsen
C Puozzo
CJ McDougle
CR Faltynek
D Chaussabel
D Denys
D Hristovski
D Olive
D Shao
DB Kell
DR Swanson
E Yung
EC Butcher
GR Hajer
H Kakeya
H Shatkay
HP Fischer
I Kola
J Han
J Kuhlmann
JA Wagner
Jacob de Vlieg
JD Wren
K Kajinami
K Miguita
K Njung'e
K Tomiyama
K Vandenborre
L Prokunina
LJ Jensen
M Briley
M Campillos
M Hayashi
M Imoto
M Inazu
M Kamata
M Sugiyama
M Yetisgen-Yildiz
MA Andrade
Marianne van Vugt
N Daraselia
NR Smalheiser
PD Pelton
PR Newby
R Frijters
R Homayouni
R Jelier
RA DiGiacomo
Raoul Frijters
René van Schaik
Ruben Smeets
RY Mukhtar
S Gordon
S Morikawa
S Raychaudhuri
SN Vaishnavi
SS Fuller
T Fawcett
T Hiramatsu
T Ito
T Shokawa
T Tabata
TK Jenssen
TT Ashburn
U Kaneyuki
WA Colburn
WK Goodman
Wynand Alkema
Y Ichimaru
Y Sugimoto
Y Tamori
Publication venue: Public Library of Science
Publication date: 01/01/2010

The scientific literature is a rich source of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from the literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well suited to discovering new, hidden relationships between biomedical concepts following a simple ABC principle, in which A and C have no direct relationship but are connected via shared B intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, newly predicted relationships between compounds and cell proliferation were confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.
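The ABC principle described above can be sketched in a few lines. This is a minimal illustration, not CoPub Discovery's implementation: ranking candidates by the number of shared B intermediates is a simplifying assumption, the function name is hypothetical, and the co-occurrence pairs are toy data loosely modeled on Swanson's well-known fish oil and Raynaud's disease example.

```python
from collections import defaultdict


def abc_candidates(cooccurrences, a):
    """Find C concepts linked to A only through shared B intermediates.

    cooccurrences: iterable of (concept, concept) pairs seen together
    in the literature. Returns (C, set of B intermediates) tuples,
    sorted by decreasing number of intermediates, then by name.
    """
    neighbours = defaultdict(set)
    for x, y in cooccurrences:
        neighbours[x].add(y)
        neighbours[y].add(x)

    direct = neighbours.get(a, set())
    candidates = defaultdict(set)
    for b in direct:
        for c in neighbours[b]:
            # Keep C only if it is not A itself and has no direct link to A.
            if c != a and c not in direct:
                candidates[c].add(b)
    return sorted(candidates.items(), key=lambda item: (-len(item[1]), item[0]))


# Toy co-occurrence pairs, invented for illustration.
pairs = [
    ("fish oil", "blood viscosity"),
    ("fish oil", "platelet aggregation"),
    ("fish oil", "triglycerides"),
    ("blood viscosity", "Raynaud's disease"),
    ("platelet aggregation", "Raynaud's disease"),
    ("triglycerides", "heart disease"),
]

hidden = abc_candidates(pairs, "fish oil")
```

Here "Raynaud's disease" ranks first because it is reachable from "fish oil" through two intermediates, whereas "heart disease" is supported by only one. A real system would weight each A-B and B-C link statistically rather than simply counting intermediates.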