
18 research outputs found

Between proteins and phenotypes: annotation and interpretation of mutations

Author: Anna Bauer-Mehren
Anshu Bhardwaj
B Smith
Christopher JO Baker
CJO Baker
CJO Baker
D Rebholz-Schuhmann
Dietrich Rebholz-Schuhmann
Joke Reumers
Jose MG Izarzugaza
K Becker
Kevin Nagel
LT Sam
Martin Krallinger
Omar Haq
PN Robinson
R Kanagasabai
Rainer Winnenburg
Süveyda Yeniterzi
Y Bromberg
Yana Bromberg
Publication venue: BioMed Central
Publication date: 27/08/2009

Crossref

Springer - Publisher Connector

PubMed Central

EnzyMiner: automatic identification of protein level mutations and their impact on target enzymes from PubMed abstracts

Author: A Bairoch
A Chang
A Fleischmann
AK McCallum
C Cleverdon
CJO Baker
CJO Baker
CT Porter
D Hanisch
D Rebholz-Schuhmann
F Horn
F Sebastiani
G Szarvas
GL Holliday
J Barthelmes
JG Caporaso
JT Chang
K Hult
K Rajaraman
LC Lee
LS Larkey
M Erdogmus
N Gövert
N Nagano
O Zamir
R Caspi
R Witte
R Witte
R Witte
RN Goldberg
SK Dwivedi
Süveyda Yeniterzi
T Karopka
Uğur Sezerman
V Renugopalakrishnan
Y Tsuruoka
Publication venue: BioMed Central
Publication date: 01/01/2009

BACKGROUND: A better understanding of the mechanisms behind an enzyme's functionality and stability, as well as knowledge of the impact of mutations, is crucial for researchers working with enzymes. Although several enzyme databases are currently available, the scientific literature remains, by and large, the up-to-date source for learning the effects of a mutation on an enzyme. However, going through vast amounts of scientific documents to extract information on a desired mutation has always been a time-consuming process. In this paper, therefore, we describe a unique method, termed EnzyMiner, which automatically identifies the PubMed abstracts that contain information on the impact of a protein-level mutation on the stability and/or the activity of a given enzyme. RESULTS: We present an automated system which identifies the abstracts that contain an amino-acid-level mutation and then classifies them according to the mutation's effect on the enzyme. For mutation identification, MuGeX, an automated mutation-gene extraction system, has an accuracy of 93.1% with a 91.5 F-measure. For impact analysis, document classification is performed to identify the abstracts that report a change in the enzyme's stability or activity resulting from the mutation. The system was trained on lipases and tested on amylases with an accuracy of 85%. CONCLUSION: EnzyMiner identifies the abstracts that contain a protein mutation for a given enzyme and checks whether each abstract is related to a disease with the help of information extraction and machine learning techniques. For disease-related abstracts, the mutation list and direct links to the abstracts are retrieved from the system and displayed on the Web. Abstracts that are not disease-related are, in addition to having their mutation list retrieved, categorized into two groups according to whether the mutation affects the enzyme's stability or its functionality, and these are likewise displayed on the Web.

Crossref

Springer - Publisher Connector

PubMed Central

Sabanci University Research Database

Annotation of protein residues based on a literature analysis: cross-validation against UniProtKb

Author: A Stark
Antonio Jimeno-Yepes
BJ Polacco
BJ Stapley
C Blaschke
C Blaschke
C Friedman
CH Wu
CJO Baker
CJO Baker
D Bourigault
D Rebholz-Schuhmann
D Rebholz-Schuhmann
Dietrich Rebholz-Schuhmann
DL Wheeler
DM Kristensen
EM Marcotte
F Cerbah
F Guenthner
F Horn
G Leroy
JA Barker
JC Nebel
Kevin Nagel
LC Lee
M Ikeda
MM Babu
P Pezik
R Kanagasabai
R Witte
S Gaudan
S Yoon
TJ Oldfield
Y Miyao
Y Tateisi
Y Tsuruoka
YL Yip
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: A protein annotation database, such as the Universal Protein Resource knowledge base (UniProtKb), is a valuable resource for the validation and interpretation of predicted 3D structure patterns in proteins. Existing studies have focussed on point mutation extraction methods from biomedical literature which can be used to support the time-consuming work of manual database curation. However, these methods were limited to point mutation extraction and do not extract features for the annotation of proteins at the residue level. Results: This work introduces a system that identifies protein residues in MEDLINE abstracts and annotates them with features extracted from the context written in the surrounding text. MEDLINE abstract texts have been processed to identify protein mentions in combination with taxonomic species and protein residues (F1-measure 0.52). The identified protein-species-residue triplets have been validated and benchmarked against reference data resources (UniProtKb, average F1-measure of 0.54). Then, contextual features were extracted through shallow and deep parsing and the features have been classified into predefined categories (F1-measure ranges from 0.15 to 0.67). Furthermore, the feature sets have been aligned with annotation types in UniProtKb to assess the relevance of the annotations for ongoing curation projects. Altogether, the annotations have been assessed automatically and manually against reference data resources. Conclusion: This work proposes a solution for the automatic extraction of functional annotation for protein residues from biomedical articles. The presented approach is an extension to other existing systems in that a wider range of residue entities are considered and that features of residues are extracted as annotations.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Bioinformatics research in the Asia Pacific: a 2007 update

Author: A Madhumalar
BC Kim
C Wang
CJO Baker
D Gilbert
DT Singh
GL Zhang
H Sugawara
H Zhao
KH Choo
L Kong
M Ganapathiraju
Michael Gribskov
N Yanamala
O Miotto
O Miotto
PD Yoo
Q Xu
R Ördög
RTH Tsai
S Dastmalchi
S Miyano
S Ranganathan
S Ranganathan
S Ranganathan
SH Chen
SH Nagaraj
Shoba Ranganathan
Tin Wee Tan
U Sangket
V Chelliah
WY Kim
YP Lim
Publication venue: BioMed Central
Publication date: 01/01/2008

We provide a 2007 update on bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, set up in 1998. In 2002, APBioNet organized the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2007 Conference was organized as the 6th annual conference of the Asia-Pacific Bioinformatics Network, on Aug. 27–30, 2007 in Hong Kong, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea) and New Delhi (India). Besides the scientific meeting in Hong Kong, the satellite events organized are a pre-conference training workshop in Hanoi, Vietnam and a post-conference workshop in Nansha, China. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. We have organized the papers into thematic areas, highlighting the growing contribution of research excellence from this region to global bioinformatics endeavours.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Purdue E-Pubs

Macquarie University ResearchOnline

ScholarBank@NUS

Ontology Design Patterns for bio-ontologies: a case study on the Cell Cycle Ontology

Author: A Gangemi
A Gangemi
A Rector
A Rector
AL Rector
AL Rector
AL Rector
B Alberts
B Smith
B Smith
C Golbreich
C Wroe
CJ Wroe
CJO Baker
D Tsarkov
D Wheeler
DA Moreira
E Antezana
E Gamma
E Sirin
Erick Antezana
FW Hartel
Gene Ontology Consortium
J Aitken
J Rogers
J Seidenberg
M Aranguren
M Horridge
Martin Kuiper
Mikel Egaña Aranguren
O Bodenreider
P Burek
P Clark
R Stevens
Robert Stevens
S Brockmans
S Kerrien
S Krivov
S Schulz
S Schulz
S Shultz
S Staab
Uniprot Consortium
V Svatek
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Bio-ontologies are key elements of knowledge management in bioinformatics. Rich and rigorous bio-ontologies should represent biological knowledge with high fidelity and robustness. The richness in bio-ontologies is a prior condition for diverse and efficient reasoning, and hence querying and hypothesis validation. Rigour allows a more consistent maintenance. Modelling such bio-ontologies is, however, a difficult task for bio-ontologists, because the necessary richness and rigour is difficult to achieve without extensive training. Results: Analogous to design patterns in software engineering, Ontology Design Patterns are solutions to typical modelling problems that bio-ontologists can use when building bio-ontologies. They offer a means of creating rich and rigorous bio-ontologies with reduced effort. The concept of Ontology Design Patterns is described and documentation and application methodologies for Ontology Design Patterns are presented. Some real-world use cases of Ontology Design Patterns are provided and tested in the Cell Cycle Ontology. Ontology Design Patterns, including those tested in the Cell Cycle Ontology, can be explored in the Ontology Design Patterns public catalogue that has been created based on the documentation system presented (http://odps.sourceforge.net/). Conclusions: Ontology Design Patterns provide a method for rich and rigorous modelling in bio-ontologies. They also offer advantages at different development levels (such as design, implementation and communication) enabling, if used, a more modular, well-founded and richer representation of the biological knowledge. This representation will produce a more efficient knowledge management in the long term.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Ghent University Academic Bibliography

PubMed Central

The University of Manchester - Institutional Repository

Extraction of human kinase mutations from literature, databases and genotyping studies

Author: A Baudot
AA Morgan
Alfonso Valencia
AW Burgess
C Greenman
C Ortutay
Carlos Rodriguez-Penagos
CJ Richardson
CJO Baker
D Rebholz-Schuhmann
D Santamaría
F Horn
G Manning
HM Berman
I Shchemelinin
IYS Tam
J Hurst
J Ptacek
JA Ubersax
JG Caporaso
JG Caporaso
Jose MG Izarzugaza
JT den Dunnen
LC Lee
LD Wood
LH Greene
LI Furlong
M Erdogmus
M Huse
M Lesk
Martin Krallinger
P Sanz
R Kanagasabai
R McDonald
R Witte
RD Finn
RE Saunders
RT McDonald
S Bamford
S Yamada
SF Altschul
T Joachims
T Sjöblom
YL Yip
YL Yip
YL Yip
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

Phylogenetic Analysis of Pelecaniformes (Aves) Based on Osteological Data: Implications for Waterbird Phylogeny and Fossil Calibration Studies

Author: A Cooper
A Feduccia
A Louchart
A Manegold
A Milne-Edwards
A Myrcha
AB Smith
AB Smith
AC Chandler
AC Milner
AR Templeton
AR Templeton
AV Panteleyev
BC Livezey
BC Livezey
BR Holland
C Mourer-Chauviré
CG Sibley
CJO Harrison
CP Tambussi
CP Tambussi
CW Cunningham
D Siegel-Causey
DL Swofford
DL Swofford
DM Hillis
DR Mertz
DR Prothero
DT Ksepka
DT Rasmussen
E Bourdon
E Bourdon
E Bourdon
EI Saiff
ELR Simons
EM Prager
G Mayr
GB Nunn
GF van Tets
GJ Dyke
H Howard
H Howard
H Howard
H Kishino
H Shimodaira
H Shimodaira
J Cracraft
J Cracraft
J Felsenstein
J Felsenstein
J Felsenstein
J Gatesy
J Gauthier
J Hughes
J Mlíkovsky
J Müller
JA Clarke
JA Coddington
JB Nelson
JF Parham
JJ Baumel
JJ Becker
JJ Bull
JJ Wiens
JJ Wiens
JL Goedert
JS Farris
JW Brown
K de Queiroz
K de Queiroz
K Dolphin
K Lambrecht
K Lambrecht
K Lambrecht
KE Omland
KE Slack
KG McCracken
KI Warheit
KI Warheit
KI Warheit
M Dowton
M Fürbringer
M Kennedy
M van Tuinen
M van Tuinen
MD Sorenson
ME Smith
MG Fain
ML Brewer
MSY Lee
N Goldman
Nathan D. Smith
ND Smith
OT Owre
P Darlu
P Jadwiszczak
PA Cottam
PGP Ericson
PH Harvey
R Lydekker
RH Baker
RL Zusi
Robert DeSalle
RW Sadlier
RW Storer
S Bertelli
S Chatterjee
S Hinic-Frlog
SB Hedges
Sharpe
SJ Hackett
SL Olson
TA Stidham
TR Buckley
VL De Pietri
VL Friesen
WA Berggren
Publication venue: Public Library of Science
Publication date: 01/10/2010

) were also assessed. The antiquity of these taxa and their purported status as stem members of extant families makes them valuable for studies of higher-level avian diversification. (sister taxon to Phalacrocoracidae). These relationships are invariant when ‘backbone’ constraints based on recent avian phylogenies are imposed. Relationships of extant pelecaniforms inferred from morphology are more congruent with molecular phylogenies than previously assumed, though notable conflicts remain. The phylogenetic position of the Plotopteridae implies that wing-propelled diving evolved independently in plotopterids and penguins, representing a remarkable case of convergent evolution. Despite robust support for the placement of fossil taxa representing key calibration points, the successive outgroup relationships of several “stem fossil + crown family” clades are variable and poorly supported across recent studies of avian phylogeny. Thus, the impact these fossils have on inferred patterns of temporal diversification depends heavily on the resolution of deep nodes in avian phylogeny.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Text Mining Improves Prediction of Protein Functional Sites

Author: A Koussounadis
A Sokolov
AG Murzin
AR Atilgan
AT Laurie
BJ Grant
CA Earhart
CB Ahlers
CJO Baker
CM Nunn
CT Porter
D Ferrucci
D Ming
D Oliver
D Zhou
DS Greer
F Horn
F Leitner
GL Card
HJ Nam
HM Berman
I Bahar
J Dundas
J Laurila
J Ory
JD Cohn
JG Caporaso
JG Caporaso
JK Hurley
JM Jez
Judith D. Cohn
JY Choe
K Hinsen
K Nagel
K Nagel
K Nagel
K Verspoor
Karin M. Verspoor
KB Cohen
KE Ravikumar
KL Damm
Komandur E. Ravikumar
L Hu
L Xie
LH Weaver
LJ Jensen
LL Huang
M Ankerst
M Krallinger
M Krallinger
ME Wall
MF Sanner
Michael E. Wall
ML Benson
ML Benson
MM Tirion
N Chim
Neil R. Smalheiser
PE Bourne
R Gaizauskas
R Witte
RC Edgar
S Perot
TW Schwartz
WA Baumgartner Jr
Publication venue: Public Library of Science
Publication date: 29/02/2012

We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

Literature mining of protein-residue associations with graph rules learned through distant supervision

Author: AA Morgan
CJO Baker
CJO Baker
D Ming
D Rebholz Schuhmann
E Buyko
E Buyko
F Horn
F Rinaldi
H Kilicoglu
H Liu
H Liu
H Liu
HM Berman
J Bjorne
J-D Kim
J-D Kim
JD Cohn
JD Kim
JG Caporaso
JG Caporaso
K Nagel
K Verspoor
KB Cohen
KM Verspoor
LC Lee
M Craven
M-C de Marneffe
MC De Marneffe
MC De Marneffe
P Thomas
P Thomas
PV Ogren
R Apweiler
R Hoffmann
R Witte
TVT Nguyen
WA Baumgartner
Y Miyao
Y Tsuruoka
Publication venue: BMC
Publication date: 01/10/2012

Background: We propose a method for automatic extraction of protein-specific residue mentions from the biomedical literature. The method searches text for mentions of amino acids at specific sequence positions and attempts to correctly associate each mention with a protein also named in the text. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method made use of linguistic patterns for identifying the amino acid residue mentions in text. Further, we applied an automated graph-based method to learn syntactic patterns corresponding to protein-residue pairs mentioned in the text. We finally present an approach to automated construction of relevant training and test data using the distant supervision model. Results: The performance of the method was assessed by extracting protein-residue relations from a new automatically generated test set of sentences containing high-confidence examples found using distant supervision. It achieved an F-measure of 0.84 on an automatically created silver corpus and 0.79 on a manually annotated gold data set for this task, outperforming previous methods. Conclusions: The primary contributions of this work are to (1) demonstrate the effectiveness of distant supervision for automatic creation of training data for protein-residue relation extraction, substantially reducing the effort and time involved in manual annotation of a data set and (2) show that the graph-based relation extraction approach we used generalizes well to the problem of protein-residue association extraction. This work paves the way towards effective extraction of protein functional residues from the literature.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

University of Melbourne Institutional Repository

Heterologous laccase production and its role in industrial applications

Author: Aehle W
Aehle W
Alcalde M
Baker CJO
Balland V
Berka R
Ferrer-Miralles N
Gelo-Pujic M
Joo SS
Kim D
Kojima Y
Koschorreck K
Kunamneni A
Laborde J
Li JF
Miele A
Molina-Guijarro JM
Soden DM
Vinod S
Williamson PR
Xu F
Yang JQ
Yaver DS
Yaver DS
Zhang YB
Zhang YB
Publication venue: Landes Bioscience
Publication date: 01/01/2010

Laccases are blue multicopper oxidases, catalyzing the oxidation of an array of aromatic substrates concomitantly with the reduction of molecular oxygen to water. These enzymes are implicated in a variety of biological activities. Most of the laccases studied thus far are of fungal origin. The large range of substrates oxidized by laccases has raised interest in using them within different industrial fields, such as pulp delignification, textile dye bleaching and bioremediation. Laccases secreted from native sources are usually not suitable for large-scale purposes, mainly due to low production yields and the high cost of preparation/purification procedures. Heterologous expression may provide higher enzyme yields and may permit the production of laccases with desired properties (such as different substrate specificities, or improved stabilities) for industrial applications. This review surveys research on heterologous laccase expression, focusing on the pivotal role played by recombinant systems in the development of robust tools for greening modern industry.

Archivio della ricerca - Università degli studi di Napoli Federico II

Crossref

PubMed Central