
64 research outputs found

Prediction of protein motions from amino acid sequence and its application to protein-protein interaction

Author: Hiroshi Wako
Kiyonobu Yokota
Satoru Kanai
Shigeru Endo
Shuichi Hirose
Tamotsu Noguchi
Yutaka Kuroda
Publication venue: Springer Nature
Publication date: 13/07/2010

BACKGROUND: Structural flexibility is an important characteristic of proteins because it is often associated with their function. The movement of a polypeptide segment in a protein can be broken down into two types of motions: internal and external ones. The former is deformation of the segment itself, but the latter involves only rotational and translational motions as a rigid body. Normal Mode Analysis (NMA) can derive these two motions, but its application remains limited because it necessitates the gathering of complete structural information. RESULTS: In this work, we present a novel method for predicting two kinds of protein motions in ordered structures. The prediction uses only information from the amino acid sequence. We prepared a dataset of the internal and external motions of segments in many proteins by application of NMA. Subsequently, we analyzed the relation between thermal motion assessed from X-ray crystallographic B-factor and internal/external motions calculated by NMA. Results show that attributes of amino acids related to the internal motion have different features from those related to the B-factors, although those related to the external motion are correlated strongly with the B-factors. Next, we developed a method to predict internal and external motions from amino acid sequences based on the Random Forest algorithm. The proposed method uses information associated with adjacent amino acid residues and secondary structures predicted from the amino acid sequence. The proposed method exhibited moderate correlation between the predicted internal and external motions and those calculated by NMA. It has the highest prediction accuracy compared to a naïve model and three published predictors. CONCLUSIONS: Finally, we applied the proposed method predicting the internal motion to a set of 20 proteins that undergo large conformational change upon protein-protein interaction. Results show significant overlaps between the predicted high internal motion regions and the observed conformational change regions.

Springer - Publisher Connector

PubMed Central

Predicting mostly disordered proteins by using structure-unknown protein data

Author: AK Dunker
AL Fink
CJ Oldfield
DT Jones
E Garner
EA Weathers
HJ Dyson
J Prilusky
JJ Ward
JW Chen
Kana Shimizu
Kentaro Tomii
LM Iakoucheva
MJ Zvelebil
NS Bogatyreva
P Romero
P Tompa
PE Wright
R Apweiler
R Linding
S Vucetic
Shuichi Hirose
SO Garbuzynskiy
T Joachims
Tamotsu Noguchi
V Receveur-Brechot
VN Uversky
X Li
Y Minezaki
Yoichi Muraoka
Z Dosztanyi
Z Obradovic
ZR Yang
Publication venue: BioMed Central
Publication date: 01/03/2007

BACKGROUND: Predicting intrinsically disordered proteins is important in structural biology because they are thought to carry out various cellular functions even though they have no stable three-dimensional structure. We know the structures of far more ordered proteins than disordered proteins. The structural distribution of proteins in nature can therefore be inferred to differ from that of proteins whose structures have been determined experimentally. We know many more protein sequences than we do protein structures, and many of the known sequences can be expected to be those of disordered proteins. Thus it would be efficient to use the information of structure-unknown proteins in order to avoid training data sparseness. We propose a novel method for predicting which proteins are mostly disordered by using spectral graph transducer and training with a huge amount of structure-unknown sequences as well as structure-known sequences. RESULTS: When the proposed method was evaluated on data that included 82 disordered proteins and 526 ordered proteins, its sensitivity was 0.723 and its specificity was 0.977. It resulted in a Matthews correlation coefficient 0.202 points higher than that obtained using FoldIndex, 0.221 points higher than that obtained using the method based on plotting hydrophobicity against the number of contacts and 0.07 points higher than that obtained using support vector machines (SVMs). To examine robustness against training data sparseness, we investigated the correlation between two results obtained when the method was trained on different datasets and tested on the same dataset. The correlation coefficient for the proposed method is 0.14 higher than that for the method using SVMs. 
When the proposed SGT-based method was compared with four per-residue predictors (VL3, GlobPlot, DISOPRED2 and IUPred (long)), its sensitivity was 0.834 for disordered proteins, which is 0.052–0.523 higher than that of the per-residue predictors, and its specificity was 0.991 for ordered proteins, which is 0.036–0.153 higher than that of the per-residue predictors. The proposed method was also evaluated on data that included 417 partially disordered proteins. It predicted the frequency of disordered proteins to be 1.95% for the proteins with 5%–10% disordered sequences, 1.46% for the proteins with 10%–20% disordered sequences and 16.57% for proteins with 20%–40% disordered sequences. CONCLUSION: The proposed method, which utilizes the information of structure-unknown data, predicts disordered proteins more accurately than other methods and is less affected by training data sparseness

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Prediction of paclitaxel sensitivity by CDK1 and CDK2 activity in human breast cancer cells

Author: Aya Katayama
D Yu
DO Morgan
E Arriola
EA Nigg
EP Mamounas
F Andre
FA Holmes
Gabriel N Hortobagyi
H Ishihara
HB Muss
Hideki Ishihara
IC Henderson
J Wehland
JF Bishop
JM Nabholtz
Keigo Gohda
KH Lu
MK Dougherty
ML Citron
MV Blagosklonny
Naoto T Ueno
P Giannakakou
P Potemski
P Sève
PJ van Diest
RU Jänicke
S Ohie
Satoshi Nakayama
SC Shen
Shinzaburo Noguchi
T Sudo
Takeshi Takahashi
Tamotsu Sudo
TH Wang
Tomokazu Yoshida
Tomoko Matsushima
Toshiyuki Sakai
W Meikrantz
Y Hasegawa
Yasuhiro Torikoshi
Yuko Kawasaki
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

SAHG, a comprehensive database of predicted structures of all human proteins

Author: Akinori Kidera
Altschul
Andreeva
Apic
Chandonia
Cheng
Chie Motono
Consortium
Cozzetto
Deshpande
Drmanac
Dunker
Dyson
Ebina
Grant
Grasso
Henrick
Hidekazu Hiroaki
Hunter
Ikeguchi
Junichi Nakata
Kana Shimizu
Katritch
Kengo Kinoshita
Kentaro Tomii
Keshava Prasad
Kiefer
Kim
Kinoshita
Kiyotaka Misoo
Koike
Kopp
Krogh
MacLean
Matsuyuki Shirota
Metzker
Miwa Sato
Motonori Ota
Nagano
Naofumi Sakaya
Nelson
Nozomi Nagano
Ostman
Ota
Pieper
Pruitt
Ryotaro Koike
Sali
Shimizu
Suyama
Takayuki Amemiya
Tamotsu Noguchi
Thornton
Tirion
Tomii
Tsuyoshi Shirai
Wang
Ward
Xie
Zhang
Publication venue: Oxford University Press

Most proteins from higher organisms are known to be multi-domain proteins and contain substantial numbers of intrinsically disordered (ID) regions. To analyse such protein sequences, those from human for instance, we developed a special protein-structure-prediction pipeline and accumulated the products in the Structure Atlas of Human Genome (SAHG) database at http://bird.cbrc.jp/sahg. With the pipeline, human proteins were examined by local alignment methods (BLAST, PSI-BLAST and Smith–Waterman profile–profile alignment), global–local alignment methods (FORTE) and prediction tools for ID regions (POODLE-S) and homology modeling (MODELLER). Conformational changes of protein models upon ligand-binding were predicted by simultaneous modeling using templates of apo and holo forms. When there were no suitable templates for holo forms and the apo models were accurate, we prepared holo models using prediction methods for ligand-binding (eF-seek) and conformational change (the elastic network model and the linear response theory). Models are displayed as animated images. As of July 2010, SAHG contains 42 581 protein-domain models in approximately 24 900 unique human protein sequences from the RefSeq database. Annotation of models with functional information and links to other databases such as EzCatDB, InterPro or HPRD are also provided to facilitate understanding the protein structure-function relationships

Crossref

PubMed Central

Variants of C-C Motif Chemokine 22 (CCL22) Are Associated with Susceptibility to Atopic Dermatitis: Case-Control Studies

Atopic dermatitis (AD) is a common inflammatory skin disease caused by multiple genetic and environmental factors. AD is characterized by the local infiltration of T helper type 2 (Th2) cells. Recent clinical studies have shown important roles of the Th2 chemokines, CCL22 and CCL17 in the pathogenesis of AD. To investigate whether polymorphisms of the CCL22 gene affect the susceptibility to AD, we conducted association studies and functional studies of the related variants. We first resequenced the CCL22 gene and found a total of 39 SNPs. We selected seven tag SNPs in the CCL22 gene, and conducted association studies using two independent Japanese populations (1st population, 916 cases and 1,032 controls; 2nd population 1,034 cases and 1,004 controls). After the association results were combined by inverse variance method, we observed a significant association at rs4359426 (meta-analysis, combined P = 9.6×10⁻⁶; OR, 0.74; 95% CI, 0.65–0.85). Functional analysis revealed that the risk allele of rs4359426 contributed to higher expression levels of CCL22 mRNA. We further examined the allelic differences in the binding of nuclear proteins by electrophoretic mobility shift assay. The signal intensity of the DNA-protein complex derived from the G allele of rs223821, which was in absolute LD with rs4359426, was higher than that from the A allele. Although further functional analyses are needed, it is likely that related variants play a role in susceptibility to AD in a gain-of-function manner. Our findings provide a new insight into the etiology and pathogenesis of AD.
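The abstract states that results from the two populations were combined by the inverse variance method. As a rough sketch of that idea, the snippet below pools per-study log odds ratios with weights equal to the reciprocal of each squared standard error. It assumes a fixed-effect model, which the abstract does not specify, and the per-population odds ratios and standard errors are hypothetical placeholders rather than values from the study.

```python
import math


def inverse_variance_pool(estimates, std_errors):
    """Fixed-effect inverse-variance pooling of per-study estimates.

    Each estimate is weighted by 1 / SE^2; the pooled standard error is
    sqrt(1 / sum of weights).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Hypothetical per-population odds ratios and standard errors (log scale).
log_ors = [math.log(0.72), math.log(0.76)]
ses = [0.10, 0.10]

log_or, se = inverse_variance_pool(log_ors, ses)
odds_ratio = math.exp(log_or)
# 95% confidence interval, back-transformed from the log scale.
ci = (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))
print(f"pooled OR={odds_ratio:.2f}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```

Pooling is done on the log scale because the sampling distribution of the log odds ratio is closer to normal than that of the odds ratio itself.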

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*

Author: Aerts Jan
Aoki-Kinoshita Kiyoko F
Arakawa Kazuharu
Aranda Bruno
Asai Kiyoshi
Barboza Lord Hendrix
Bonnal Raoul JP
Bruskiewich Richard
Bryne Jan C
Chun Hong-Woo
Fernández José M
Funahashi Akira
Gordon Paul MK
Goto Naohisa
Groscurth Andreas
Gutteridge Alex
Holland Richard
Kano Yoshinobu
Katayama Toshiaki
Kawas Edward A
Kawashima Shuichi
Kerhornou Arnaud
Kibukawa Eri
Kinjo Akira R
Kuhn Michael
Lapp Hilmar
Lehvaslaiho Heikki
Nakamura Hiroyuki
Nakamura Yasukazu
Nakao Mitsuteru
Nishizawa Tatsuya
Nobata Chikashi
Noguchi Tamotsu
Oinn Thomas M
Okamoto Shinobu
Ono Keiichiro
Owen Stuart
Pafilis Evangelos
Pocock Matthew
Prins Pjotr
Ranzinger René
Reisinger Florian
Salwinski Lukasz
Schreiber Mark
Senger Martin
Shigemoto Yasumasa
Standley Daron M
Sugawara Hideaki
Takagi Toshihisa
Tashiro Toshiyuki
Trelles Oswaldo
Vos Rutger A
Wilkinson Mark D
Yamaguchi Atsuko
Yamamoto Yasunori
York William
Zmasek Christian M
Publication venue: BioMed Central
Publication date: 01/01/2010

Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems without the need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Wageningen University & Research Publications

eScholarship - University of California

MFSPSSMpred: identifying short disorder-to-order binding regions in disordered proteins based on contextual local evolutionary conservation

Author: A Mohan
C Cheng-Wei
C Chica
C Chih-Chung
C Yugong
Chun Fang
CJ Oldfield
D Zsuzsanna
Daisuke Tominaga
ED Norman
FA Stephen
Hayato Yamana
JH Niall
JM Marcin
JW Jonathan
K Shimizu
KL Ioly
L McGuffin
M Fuxreiter
MD Fatemeh
RC Gonzalez
S Avner
Tamotsu Noguchi
V Vacic
Z Dosztanyi
Z Tuo
Publication venue: Springer Science and Business Media LLC

Crossref

PDB-REPRDB: a database of representative protein chains from the Protein Data Bank (PDB)

Author: Tamotsu Noguchi
Yutaka Akiyama
Publication date: 01/01/2001

PDB-REPRDB is a database of representative protein chains from the Protein Data Bank (PDB). Started at the Real World Computing Partnership (RWCP) in August 1997, it developed to the present system of PDB-REPRDB. In April 2001, the system was move

CiteSeerX