39 research outputs found

Mining metabolites: extracting the yeast metabolome from the literature

Author: Chikashi Nobata
CR Batchelor
D Banville
D Broadhurst
D Jiao
DB Kell
Douglas B. Kell
GA Eller
J Brecher
J Finkel
J Townsend
J Wisniewski
J Wren
JD Kim
Jun’ichi Tsujii
K Degtyarenko
KM Hettne
L Goebels
M Hucka
M Kanehisa
M Krallinger
N Okazaki
P Corbett
P Mendes
Paul D. Dobson
PD Dobson
Pedro Mendes
R Hoffmann
R Klinger
S Ananiadou
Sophia Ananiadou
Syed A. Iqbal
X Wang
Y Kano
Y Miyao
Y Sasaki
Y Tsuruoka
Publication venue: Springer US
Publication date: 01/01/2011

Text mining methods have added considerably to our capacity to extract biological knowledge from the literature. Recently the field of systems biology has begun to model and simulate metabolic networks, requiring knowledge of the set of molecules involved. While genomics and proteomics technologies are able to supply the macromolecular parts list, the metabolites are less easily assembled. Most metabolites are known and reported through the scientific literature, rather than through large-scale experimental surveys. Thus it is important to recover them from the literature. Here we present a novel tool to automatically identify metabolite names in the literature, and associate structures where possible, to define the reported yeast metabolome. With ten-fold cross-validation on a manually annotated corpus, our recognition tool achieves an F-score of 78.49 (precision of 83.02) and demonstrates greater suitability in identifying metabolite names than other existing recognition tools for general chemical molecules. The metabolite recognition tool has been applied to the literature covering an important model organism, the yeast Saccharomyces cerevisiae, to define its reported metabolome. By coupling to ChemSpider, a major chemical database, we have identified structures for much of the reported metabolome and, where structure identification fails, been able to suggest extensions to ChemSpider. Our manually annotated gold-standard data on 296 abstracts are available as supplementary materials. Metabolite names and, where appropriate, structures are also available as supplementary materials.

Crossref

PubMed Central

The University of Manchester - Institutional Repository

The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*

Author: Aerts Jan
Aoki-Kinoshita Kiyoko F
Arakawa Kazuharu
Aranda Bruno
Asai Kiyoshi
Barboza Lord Hendrix
Bonnal Raoul JP
Bruskiewich Richard
Bryne Jan C
Chun Hong-Woo
Fernández José M
Funahashi Akira
Gordon Paul MK
Goto Naohisa
Groscurth Andreas
Gutteridge Alex
Holland Richard
Kano Yoshinobu
Katayama Toshiaki
Kawas Edward A
Kawashima Shuichi
Kerhornou Arnaud
Kibukawa Eri
Kinjo Akira R
Kuhn Michael
Lapp Hilmar
Lehvaslaiho Heikki
Nakamura Hiroyuki
Nakamura Yasukazu
Nakao Mitsuteru
Nishizawa Tatsuya
Nobata Chikashi
Noguchi Tamotsu
Oinn Thomas M
Okamoto Shinobu
Ono Keiichiro
Owen Stuart
Pafilis Evangelos
Pocock Matthew
Prins Pjotr
Ranzinger René
Reisinger Florian
Salwinski Lukasz
Schreiber Mark
Senger Martin
Shigemoto Yasumasa
Standley Daron M
Sugawara Hideaki
Takagi Toshihisa
Tashiro Toshiyuki
Trelles Oswaldo
Vos Rutger A
Wilkinson Mark D
Yamaguchi Atsuko
Yamamoto Yasunori
York William
Zmasek Christian M
Publication venue: BioMed Central
Publication date: 01/01/2010

Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security, are discussed. As a result, we improved the interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Wageningen University & Research Publications

eScholarship - University of California

CRL/NYU summarization system at DUC-2004

Author: Chikashi Nobata
Satoshi Sekine
Publication date: 01/01/2004

We participated in two multi-document summarization tasks (Task 2 and Task 5) at the DUC-2004 formal run and evaluated the performance of our summarization system. Our system, which is based on sentence extraction, also uses a module to estimate the similarity between sentences. The similarity information was used either for selecting the representative sentence among similar sentences or for gathering key sentences that have similar structures but different contents. We also incorporated a module that categorized document sets into two groups corresponding to the distribution of key sentences.

CiteSeerX

The University of Manchester - Institutional Repository

Results of CRL/NYU System at DUC-2003 and an Experiment on Division of Document.

Author: Nobata Chikashi
Sekine Satoshi
Publication date: 01/06/2003

The University of Manchester - Institutional Repository

Sentence Extraction with Information Extraction technique.

Author: Nobata Chikashi
Sekine Satoshi
Publication date: 01/09/2001

The University of Manchester - Institutional Repository

Towards Automatic Acquisition of Patterns for Information Extraction.

Author: Nobata Chikashi
Sekine Satoshi
Publication date: 01/03/1999

The University of Manchester - Institutional Repository

A Survey for Multi-document Summarization.

Author: Nobata Chikashi
Sekine Satoshi
Publication date: 01/01/2003

Automatic multi-document summarization is still hard to realize. Under such circumstances, we believe it is important to observe how humans perform the same task and to look for different strategies. We prepared 100 document sets similar to the ones used in the DUC multi-document summarization task. For each document set, several people prepared the following data, and we conducted a survey: A) free-style summarization; B) sentence-extraction-type summarization; C) axis (type of main topic); D) table-style summary. In particular, we describe the last two in detail, as these could lead to a new direction for multi-summarization research.

CiteSeerX

Crossref

The University of Manchester - Institutional Repository

Nigel Collier, Chikashi Nobata, and Jun'ichi Tsujii.

Author: Collier Nigel
Nobata Chikashi
Tsujii Jun'ichi
Publication date: 01/01/2001

The University of Manchester - Institutional Repository