
The chicken gene nomenclature committee report

Author: Antin Parker B
Burgess Shane C
Burt David W
Carré Wilfrid
Fell Mark
Law Andy S
Maglott Donna R
McCarthy Fiona M
Schmidt Carl J
Weber Janet A
Publication venue: BioMed Central
Publication date: 01/01/2009

Comparative genomics is an essential component of the post-genomic era. The chicken genome is the first avian genome to be sequenced and it will serve as a model for other avian species. Moreover, due to its unique evolutionary niche, the chicken genome can be used to understand evolution of functional elements and gene regulation in mammalian species. However, comparative biology both within avian species and within amniotes is hampered by the difficulty of recognising functional orthologs. This problem is compounded as different databases and sequence repositories proliferate and the names they assign to functional elements proliferate along with them. Currently, genes can be published under more than one name and one name sometimes refers to unrelated genes. Standardized gene nomenclature is necessary to facilitate communication between scientists and genomic resources. Moreover, it is important that this nomenclature be based on existing nomenclature efforts where possible to truly facilitate studies between different species. We report here the formation of the Chicken Gene Nomenclature Committee (CGNC), an international and centralized effort to provide standardized nomenclature for chicken genes. The CGNC works in conjunction with public resources such as NCBI and Ensembl and in consultation with existing nomenclature committees for human and mouse. The CGNC will develop standardized nomenclature in consultation with the research community and relies on the support of the research community to ensure that the nomenclature facilitates comparative and genomic studies.

Springer - Publisher Connector

Edinburgh Research Explorer

The University of Arizona

University of Queensland eSpace

AS-ALPS: a database for analyzing the effects of alternative splicing on protein structure, interaction and network in human and mouse

Author: A. Yamaguchi
Altschul
Berman
Dreumont
Gough
Imanishi
K. Shinoda
K.-i. Takahashi
M. Go
M. Shionyu
Maglott
Mulder
Nagy
Noguti
Pearson
Stetefeld
Yamaguchi
Yura
Publication venue: Oxford University Press

We have constructed a database, AS-ALPS (alternative splicing-induced alteration of protein structure), which provides information that would be useful for analyzing the effects of alternative splicing (AS) on protein structure, interactions with other bio-molecules and protein interaction networks in human and mouse. Several AS events have been revealed to contribute to the diversification of protein structure, which results in diversification of interaction partners or affinities, which in turn contributes to regulation of bio-molecular networks. Most AS variants, however, are only known at the sequence level. It is important to determine the effects of AS on protein structure and interaction, and to provide candidates for experimental targets that are relevant to network regulation by AS. For this purpose, the three-dimensional (3D) structures of proteins are valuable sources of information; however, these have not been fully exploited in any other AS-related databases. AS-ALPS is the only AS-related database that describes the spatial relationships between protein regions altered by AS (‘AS regions’) and both the proteins’ hydrophobic cores and sites of inter-molecular interactions. This information makes it possible to infer whether protein structural stability and/or protein interaction are affected by each AS event. AS-ALPS can be freely accessed at http://as-alps.nagahama-i-bio.ac.jp and http://genomenetwork.nig.ac.jp/as-alps/

OmicBrowse: a Flash-based high-performance graphics interface for genomic resources

Author: A. Matsushima
Harris
Hubbard
Kawai
Kurihara
M. Ishii
Maglott
Masuya
N. Kobayashi
Ohyanagi
Povey
R. Umetsu
Rhee
S. Kawaguchi
Sakurai
Stein
T. A. Endo
T. Toyoda
Toyoda
Y. Makita
Y. Mochizuki
Publication venue: Oxford University Press

OmicBrowse is a genome browser designed as a scalable system for maintaining numerous genome annotation datasets. It is an open source tool that regulates access to each dataset on a per-user basis, so that multiple users can each have their own integrative view of both their unpublished and published datasets; this significantly reduces the maintenance costs of supplying each collaborator exclusively with their own private data. OmicBrowse supports DAS1 imports and exports of annotations to Internet site servers worldwide. We also provide a data-download server, OmicDownload, that interactively selects datasets and filters the data in the selected datasets. Our OmicBrowse server has been freely available at http://omicspace.riken.jp/ since its launch in 2003. The OmicBrowse source code is downloadable from http://sourceforge.net/projects/omicbrowse/

MAGIA, a web-based tool for miRNA and Genes Integrated Analysis

Author: A. Bisognin
A. Coppe
Bagga
Bashir
Brown
C. Romualdi
Calin
Callis
Chien
Enright
Flynt
Fulci
G. Sales
Garzon
Hahne
Hrstka
Huang
Kertesz
Kraskov
Krek
Lewis
Lim
M. Biasiolo
Maglott
Moreau
Peifer
Plikus
Qiang
Rebholz-Schuhmann
REHMSMEIER
S. Bortoluzzi
Steuer
Wu
Zhao
Publication venue: Oxford University Press
Publication date: 01/01/2010

MAGIA (miRNA and genes integrated analysis) is a novel web tool for the integrative analysis of target predictions, miRNA and gene expression data. MAGIA is divided into two parts. The query section allows the user to retrieve and browse updated miRNA target predictions computed with a number of different algorithms (PITA, miRanda and TargetScan) and Boolean combinations thereof. The analysis section comprises a multistep procedure for (i) direct integration of mRNA and miRNA expression data through different functional measures (parametric and non-parametric correlation indexes, a variational Bayesian model, mutual information and a meta-analysis approach based on P-value combination), (ii) construction of a bipartite regulatory network of the best putative miRNA and mRNA interactions and (iii) retrieval of information available in several public databases of genes, miRNAs and diseases and via scientific literature text-mining. MAGIA is freely available for academic users at http://gencomp.bio.unipd.it/magia
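
The two ideas in the abstract above, Boolean combination of target predictions and correlation-based integration of expression data, can be illustrated with a minimal Python sketch. This is not MAGIA's code: the prediction sets, the expression values, the function names (`pearson`, `anticorrelated`) and the -0.8 threshold are all hypothetical, and only the parametric (Pearson) correlation is shown.

```python
from math import sqrt

# Hypothetical predicted (miRNA, gene) pairs from three algorithms.
pita = {("miR-1", "GENE_A"), ("miR-1", "GENE_B"), ("miR-2", "GENE_C")}
miranda = {("miR-1", "GENE_A"), ("miR-2", "GENE_C"), ("miR-2", "GENE_D")}
targetscan = {("miR-1", "GENE_A"), ("miR-1", "GENE_B")}

# Boolean combinations map onto set operations: AND is intersection, OR is union.
all_three = pita & miranda & targetscan
pita_and_any_other = pita & (miranda | targetscan)


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical expression profiles measured on the same four samples.
mirna_expr = {"miR-1": [1.0, 2.0, 3.0, 4.0], "miR-2": [2.0, 2.5, 1.0, 3.0]}
gene_expr = {
    "GENE_A": [8.0, 6.0, 4.0, 2.0],
    "GENE_B": [1.0, 2.0, 3.0, 4.0],
    "GENE_C": [3.0, 1.0, 2.0, 2.5],
}


def anticorrelated(pairs, threshold=-0.8):
    """Keep predicted pairs whose expression profiles are strongly anti-correlated."""
    return sorted(
        (mirna, gene)
        for mirna, gene in pairs
        if pearson(mirna_expr[mirna], gene_expr[gene]) <= threshold
    )


supported = anticorrelated(pita_and_any_other)
```

In this toy data only the miR-1/GENE_A pair is both predicted and strongly anti-correlated, so it is the only edge that would enter a bipartite miRNA-gene network built from `supported`.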

Archivio istituzionale della ricerca - Università di Padova

The strength of co-authorship in gene name disambiguation

Author: A Morgan
AL Barabasi
AS Yeh
B Schijvenaars
D Hanisch
DR Maglott
G Savova
H Liu
H Xu
H Xu
H Xu
IH Witten
J Hakenberg
JR Quinlan
L Chen
L Hirschman
M Weeber
Richárd Farkas
RM Podowski
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: A biomedical entity mention in articles and other free texts is often ambiguous. For example, 13% of the gene names (aliases) might refer to more than one gene. The task of Gene Symbol Disambiguation (GSD) – a special case of Word Sense Disambiguation (WSD) – is to assign a unique gene identifier for all identified gene name aliases in biology-related articles. Supervised and unsupervised machine learning WSD techniques have been applied in the biomedical field with promising results. We examine here, through graph-based semi-supervised methods, how the GSD task can exploit one of the special features of biological articles: the fact that the authors of the documents are known.
Results: Our key hypothesis is that a biologist refers to each particular gene by a fixed gene alias and this holds for the co-authors as well. To make use of the co-authorship information we decided to build the inverse co-author graph on MedLine abstracts. The nodes of the inverse co-author graph are articles and there is an edge between two nodes if and only if the two articles have a mutual author. We introduce here two methods using distances (based on the graph) of abstracts for the GSD task. We found that a disambiguation decision can be made in 85% of cases with an extremely high (99.5%) precision rate just by using information obtained from the inverse co-author graph. We incorporated the co-authorship information into two GSD systems in order to attain full coverage and in experiments our procedure achieved precision of 94.3%, 98.85%, 96.05% and 99.63% on the human, mouse, fly and yeast GSD evaluation sets, respectively.
Conclusion: Based on the promising results obtained so far we suggest that the co-authorship information and the circumstances of the articles' release (like the title of the journal, the year of publication) can be a crucial building block of any sophisticated similarity measure among biological articles and hence the methods introduced here should be useful for other biomedical natural language processing tasks (like organism or target disease detection) as well.
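
The inverse co-author graph described in the abstract above (articles as nodes, an edge if and only if two articles share an author) can be sketched in a few lines of Python. The abstract does not spell out the two distance-based methods, so the nearest-labelled-article rule below is only an assumed stand-in, and the article IDs, author names and gene labels are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_inverse_coauthor_graph(article_authors):
    """Nodes are article IDs; two articles are linked iff they share an author."""
    articles_by_author = defaultdict(set)
    for article, authors in article_authors.items():
        for author in authors:
            articles_by_author[author].add(article)
    graph = {article: set() for article in article_authors}
    for articles in articles_by_author.values():
        for a, b in combinations(sorted(articles), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def graph_distance(graph, start, goal):
    """Breadth-first search: edges on the shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def nearest_label(graph, target, labelled):
    """Assumed rule: take the gene ID of the closest labelled article.

    Returns None when no labelled article is reachable, i.e. no decision is made.
    """
    best = None
    for article, gene in sorted(labelled.items()):
        dist = graph_distance(graph, target, article)
        if dist is not None and (best is None or dist < best[0]):
            best = (dist, gene)
    return None if best is None else best[1]


# Hypothetical articles and their authors.
articles = {
    "a1": {"Smith", "Lee"},
    "a2": {"Lee", "Kim"},
    "a3": {"Kim"},
    "a4": {"Garcia"},
    "a5": {"Garcia", "Okafor"},
    "a6": {"Novak"},
}
graph = build_inverse_coauthor_graph(articles)

# Articles in which the ambiguous alias has already been resolved (hypothetical labels).
labelled = {"a1": "gene_X", "a5": "gene_Y"}
```

With this toy data `nearest_label(graph, "a3", labelled)` follows the path a3, a2, a1 and returns `gene_X`, while the isolated article `a6` yields `None`. That abstention mirrors the abstract's point that the graph alone decides only a share of cases and is combined with other GSD systems for full coverage.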

Springer - Publisher Connector

Southampton (e-Prints Soton)

Michigan molecular interactions r2: from interacting proteins to pathways

Author: A. Ade
A. Bookvich
A. Chapman
A. Ozgur
B. Athey
B. Mirel
Bader
Birkland
Chen
Consortium
D. Radev
D. States
Finn
H. V. Jagadish
Hermjakob
J. Cavalcoli
J. Gao
J. Patel
Joshi-Tope
Kersey
Kim
Liebel
M. Jayapandian
Maglott
Mulder
Parrish
Peri
Sasson
Shannon
Stark
Stelzl
T. Weymouth
V. G. Tarcea
V. Mahavisno
Wiwatwattana
Xenarios
Y. Tian
Z. Wright
Publication venue: Oxford University Press
Publication date: 01/01/2009

Molecular interaction data exists in a number of repositories, each with its own data format, molecule identifier and information coverage. Michigan molecular interactions (MiMI) assists scientists searching through this profusion of molecular interaction data. The original release of MiMI gathered data from well-known protein interaction databases, and deep-merged this information while keeping track of provenance. Based on the feedback received from users, MiMI has been completely redesigned. This article describes the resulting MiMI Release 2 (MiMIr2). New functionality includes extension from proteins to genes and to pathways; identification of highlighted sentences in source publications; seamless two-way linkage with Cytoscape; query facilities based on MeSH/GO terms and other concepts; approximate graph matching to find relevant pathways; support for querying in bulk; and a user focus-group driven interface design. MiMI is part of the NIH's National Center for Integrative Biomedical Informatics (NCIBI) and is publicly available at: http://mimi.ncibi.org

CiteSeerX

EMAGE mouse embryo spatial gene expression database: 2010 update

Author: Bairoch
Bard
Bult
Christiansen
Davidson
Deutsch
Duncan R. Davidson
Fisher
Harris
Hubbard
Jeffrey H. Christiansen
Jianguo Rao
Kanehisa
Lorna Richardson
Maglott
Malcolm Fisher
Nicholas Burton
Peter Stevenson
Pruitt
Richard A. Baldock
Sayers
Shanmugasundaram Venkataraman
Sharpe
Smedley
Smith
Sprague
Tamplin
Venkataraman
Visel
Wilming
Yiya Yang
Publication venue: Oxford University Press
Publication date: 01/01/2010

EMAGE (http://www.emouseatlas.org/emage) is a freely available online database of in situ gene expression patterns in the developing mouse embryo. Gene expression domains from raw images are extracted and integrated spatially into a set of standard 3D virtual mouse embryos at different stages of development, which allows data interrogation by spatial methods. An anatomy ontology is also used to describe sites of expression, which allows data to be queried using text-based methods. Here, we describe recent enhancements to EMAGE including: the release of a completely re-designed website, which offers integration of many different search functions in HTML web pages, improved user feedback and the ability to find similar expression patterns at the click of a button; back-end refactoring from an object oriented to relational architecture, allowing associated SQL access; and the provision of further access by standard formatted URLs and a Java API. We have also increased data coverage by sourcing from a greater selection of journals and developed automated methods for spatial data annotation that are being applied to spatially incorporate the genome-wide (∼19 000 gene) ‘EURExpress’ dataset into EMAGE

CiteSeerX

Edinburgh Research Explorer

Automatic Assignment of EC Numbers

Author: A McNaught
AJ Barrett
CH Wu
D Latino
D Maglott
Dietmar Schomburg
H Ma
Herbert M. Sauro
I Schomburg
Ida Schomburg
J Apostolakis
JS Edwards
M Hattori
M Kotera
NM Luscombe
P Willet
R Caspi
R Körner
S Goto
S Schmidt
TJ Hubbard
Volker Egelhofer
Publication venue: Public Library of Science
Publication date: 01/01/2010

A wide range of research areas in molecular biology and medical biochemistry require a reliable enzyme classification system, e.g., drug design, metabolic network reconstruction and systems biology. When research scientists in the above-mentioned areas wish to unambiguously refer to an enzyme and its function, the EC number introduced by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB) is used. However, each and every one of these applications is critically dependent upon the consistency and reliability of the underlying data for success. We have developed tools for the validation of the EC number classification scheme. In this paper, we present validated data of 3788 enzymatic reactions including 229 sub-subclasses of the EC classification system. Over 80% agreement was found between our assignment and the EC classification. For 61 (i.e., only 2.5%) reactions we found that their assignment was inconsistent with the rules of the nomenclature committee; they have to be transferred to other sub-subclasses. We demonstrate that our validation results can be used to initiate corrections and improvements to the EC number classification scheme.
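
The kind of comparison reported above, agreement between an automatic assignment and the official classification at the sub-subclass level, can be sketched as follows. The abstract does not describe how the authors' tools assign classes, so this only shows the bookkeeping: the reaction IDs and the predicted sub-subclasses are hypothetical, and `sub_subclass` and `agreement` are illustrative names.

```python
def sub_subclass(ec_number):
    """Return the first three levels of a four-level EC number ('1.1.1.1' -> '1.1.1')."""
    parts = ec_number.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected a four-level EC number, got {ec_number!r}")
    return ".".join(parts[:3])


def agreement(official, predicted):
    """Compare predicted sub-subclasses with those implied by official EC numbers.

    Returns the fraction of reactions that agree and the sorted IDs that disagree.
    """
    mismatches = sorted(
        reaction
        for reaction, ec_number in official.items()
        if sub_subclass(ec_number) != predicted[reaction]
    )
    fraction = 1 - len(mismatches) / len(official)
    return fraction, mismatches


# Official EC numbers keyed by hypothetical reaction IDs.
official = {"r1": "1.1.1.1", "r2": "2.7.1.1", "r3": "3.4.21.4", "r4": "1.1.1.2"}
# Made-up output of an automatic classifier, at sub-subclass level.
predicted = {"r1": "1.1.1", "r2": "2.7.1", "r3": "3.4.22", "r4": "1.1.1"}

fraction, mismatches = agreement(official, predicted)
```

Here three of four toy reactions agree, and `r3` is flagged as a candidate for review. In the paper's setting a mismatch does not by itself say which side is wrong, which is why flagged reactions are checked against the nomenclature rules.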

Identifying hypothetical genetic influences on complex disease phenotypes

Author: A Hamosh
Benjamin J Keller
BL Chang
C Relton
D Hristovski
D Maglott
DH Xiong
FS Turner
G Grimes
J Pandey
K Matsuo
LJ Scott
M Oti
NJ Cox
RE Urwin
Richard C McEachin
S Aerts
S Ekins
S Lee
SH Wu
VU Onay
WJ Kent
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Statistical interactions between disease-associated loci of complex genetic diseases suggest that genes from these regions are involved in a common mechanism impacting, or impacted by, the disease. The computational problem we address is to discover relationships among genes from these interacting regions that may explain the observed statistical interaction and the role of these genes in the disease phenotype.
Results: We describe a heuristic algorithm for generating hypothetical gene relationships from loci associated with a complex disease phenotype. This approach, called Prioritizing Disease Genes by Analysis of Common Elements (PDG-ACE), mines biomedical keywords from text descriptions of genes and uses them to relate genes close to disease-associated loci. A keyword common to, and significantly over-represented in, a pair of gene descriptions may represent a preliminary hypothesis about the biological relationship between the genes, and suggest the role the genes play in the disease phenotype.
Conclusion: Our experimentation shows that the approach finds previously published relationships, while failing to find relationships that don't exist. The results also indicate that the approach is robust to differences in keyword vocabulary. We outline a brief case study in which results from a recently published Type 2 Diabetes association study are used to identify potential hypotheses.
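
The core idea above, a keyword shared by two gene descriptions that is uncommon in gene descriptions generally, can be illustrated with a short Python sketch. This is not the PDG-ACE algorithm: the abstract does not give its significance test, so a simple background-frequency cutoff stands in for it, and the gene names, descriptions, stopword list and `max_background` parameter are all hypothetical.

```python
import re

# Tiny illustrative stopword list; a real system would use a curated vocabulary.
STOPWORDS = {"the", "a", "of", "in", "and", "is", "to", "by"}


def keywords(text):
    """Lower-cased word set of a description, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def shared_rare_keywords(descriptions, gene_a, gene_b, max_background=0.5):
    """Keywords in both descriptions whose background frequency is low.

    Background frequency is the fraction of all descriptions containing the
    keyword; the cutoff is a crude stand-in for a proper over-representation test.
    """
    keyword_sets = {gene: keywords(text) for gene, text in descriptions.items()}
    total = len(keyword_sets)
    result = {}
    for word in keyword_sets[gene_a] & keyword_sets[gene_b]:
        frequency = sum(word in kws for kws in keyword_sets.values()) / total
        if frequency <= max_background:
            result[word] = frequency
    return result


# Hypothetical gene descriptions.
descriptions = {
    "GENE1": "regulates insulin secretion in the pancreas",
    "GENE2": "insulin receptor signalling protein",
    "GENE3": "structural protein of the cytoskeleton",
    "GENE4": "protein kinase involved in cell cycle",
}

hypothesis = shared_rare_keywords(descriptions, "GENE1", "GENE2")
no_hypothesis = shared_rare_keywords(descriptions, "GENE2", "GENE3")
```

In this toy corpus "insulin" links GENE1 and GENE2, whereas "protein", shared by GENE2 and GENE3, is too common across descriptions to count as a hypothesis.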

Springer - Publisher Connector