25 research outputs found

Predicting residue-wise contact orders in proteins by support vector regression

Author: A Bairoch
AG Murzin
AR Kinjo
AR Kinjo
AR Kinjo
AR Kinjo
B Rost
CH Tsai
D Kihara
D Sarda
DT Jones
G Pollastri
G Pollastri
GP Raghava
HM Berman
J Song
J Wang
Jiangning Song
JM Chandonia
Kevin Burrage
KW Plaxco
M Punta
MPS Brown
NP Prabhu
S Ahmad
S Hua
S Hua
V Vapnik
V Vapnik
W Kabsch
W Liu
X Wang
Z Yuan
Z Yuan
Z Yuan
Z Yuan
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The residue-wise contact order (RWCO) describes the sequence separations between a residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and could give deep insights into protein sequence-structure relationships. RESULTS: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance: local sequence in the form of PSI-BLAST profiles; local sequence plus amino acid composition; local sequence plus molecular weight; local sequence plus secondary structure predicted by PSIPRED; local sequence plus molecular weight and amino acid composition; local sequence plus molecular weight and predicted secondary structure; and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55 and a root mean square error (RMSE) of 0.82, based on a well-defined dataset of 680 protein sequences. Moreover, by incorporating global features such as molecular weight and amino acid composition, we could further improve the prediction performance to a CC of 0.57 and an RMSE of 0.79. In addition, adding the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and an RMSE of 0.78, which is at least comparable to the performance of other existing methods. CONCLUSION: The SVR method shows a prediction performance competitive with, or at least comparable to, the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the view that support vector regression is a powerful tool for extracting the protein sequence-structure relationship and for estimating protein structural profiles from amino acid sequences.
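The abstract above reports results as a Pearson correlation coefficient (CC) and a root mean square error (RMSE) between observed and predicted values. The following is a minimal sketch of those two standard metrics, not the authors' code; the function names and the toy values are assumptions made for illustration only.

```python
import math


def pearson_cc(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Toy values for illustration only; these are not data from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]

cc_value = pearson_cc(observed, predicted)
rmse_value = rmse(observed, predicted)
print(f"CC = {cc_value:.3f}, RMSE = {rmse_value:.3f}")
```

A higher CC and a lower RMSE both indicate closer agreement, which is why the abstract reports the two together.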

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Queensland University of Technology ePrints Archive

UQ eSpace (University of Queensland)

Improving the performance of DomainDiscovery of protein domain boundary assignment using inter-domain linker index

Author: A Andreeva
Abdur R Sikder
Albert Y Zomaya
AR Sikder
FMG Pearl
G Pollastri
G Pollastri
HM Berman
J Cheng
J Liu
J Sim
JE Gewehr
L Kong
M Dumontier
M Suyama
N Nagarajan
OV Galzitskaya
RA George
RL Marsden
S Veretnik
SF Altschul
SJ Wheelan
T Joachims
TA Holland
V Vapnik
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Knowledge of protein domain boundaries is critical for the characterisation and understanding of protein function. The ability to identify domains without knowledge of the structure, using sequence information only, is an essential step in many types of protein analyses. In the present study, we demonstrate that the performance of DomainDiscovery is improved significantly by including the inter-domain linker index value for domain identification from sequence-based information. Improved DomainDiscovery uses a Support Vector Machine (SVM) approach and a unique training dataset built on the principle of consensus among experts in defining domains in protein structure. The SVM was trained using a PSSM (Position Specific Scoring Matrix), secondary structure, solvent accessibility information and the inter-domain linker index to detect possible domain boundaries for a target sequence. RESULTS: Improved DomainDiscovery is compared with other methods by benchmarking against a structurally non-redundant dataset and also CASP5 targets. Improved DomainDiscovery achieves 70% accuracy for domain boundary identification in multi-domain proteins. CONCLUSION: Improved DomainDiscovery compares favourably with other methods and excels in the identification of domain boundaries for multi-domain proteins as a result of introducing a support vector machine with the benchmark_2 dataset.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

svmPRAT: SVM-based Protein Residue Annotation Toolkit

Author: A Kernytsky
AG de Brevern
AG Murzin
AK Dunker
AR Kinjo
B Rost
C Etchebest
C Kauffman
Christopher Kauffman
DT Jones
DT Jones
G Karypis
G Pollastri
G Pollastri
GE Crooks
George Karypis
H Rangwala
Huzefa Rangwala
J Cheng
J Cheng
M Gribskov
O Noivirt-Brik
R Ahmed
R Karchin
R Sanchez
RC Whaley
S Ahmad
S Hirose
SF Altschul
T Joachims
T Schwede
V Vapnik
VN Vapnik
W Kabsch
Y Ofran
Z Dosztányi
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Over the last decade several prediction methods have been developed for determining the structural and functional properties of individual protein residues using sequence and sequence-derived information. Most of these methods are based on support vector machines, as they provide accurate and generalizable prediction models. Results: We present a general purpose protein residue annotation toolkit (svmPRAT) to allow biologists to formulate residue-wise prediction problems. svmPRAT formulates the annotation problem as a classification or regression problem using support vector machines. One of the key features of svmPRAT is its ease of use in incorporating any user-provided information in the form of feature matrices. For every residue, svmPRAT captures local information around the residue to create fixed-length feature vectors. svmPRAT implements accurate and fast kernel functions, and also introduces a flexible window-based encoding scheme that accurately captures signals and patterns for training effective predictive models. Conclusions: In this work we evaluate svmPRAT on several classification and regression problems, including disorder prediction, residue-wise contact order estimation, DNA-binding site prediction, and local structure alphabet prediction. svmPRAT has also been used for the development of a state-of-the-art transmembrane helix prediction method called TOPTMH and a secondary structure prediction method called YASSPP. The toolkit provides practitioners with an efficient and easy-to-use tool for a wide variety of annotation problems. Availability: http://www.cs.gmu.edu/~mlbio/svmprat
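The window-based encoding mentioned in the abstract above (local information around each residue turned into a fixed-length feature vector) can be illustrated with a minimal sketch. This is not svmPRAT's implementation or interface: the function name `window_features`, the `half_width` parameter, and the zero-padding at the sequence termini are all assumptions made for illustration.

```python
def window_features(feature_matrix, half_width):
    """Build one fixed-length vector per residue from a per-residue feature matrix.

    Each vector concatenates the feature rows of the residues inside a window
    of 2 * half_width + 1 positions centred on the residue. Positions that
    fall outside the sequence are zero-padded (an assumption of this sketch).
    """
    n = len(feature_matrix)
    dim = len(feature_matrix[0])
    pad = [0.0] * dim
    vectors = []
    for i in range(n):
        vec = []
        for j in range(i - half_width, i + half_width + 1):
            row = feature_matrix[j] if 0 <= j < n else pad
            vec.extend(row)
        vectors.append(vec)
    return vectors


# Toy input: 3 residues with 2 features each (illustrative values only).
toy_matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
window_vectors = window_features(toy_matrix, half_width=1)
for vec in window_vectors:
    print(vec)
```

Every residue yields a vector of the same length, (2 * half_width + 1) * dim, which is what lets a standard classifier or regressor consume the windows directly.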

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Improved general regression network for protein domain boundary prediction

Author: A Ceroni
A Vieira
Abdur R Sikder
AK Jain
Albert Y Zomaya
AR Sikder
AR Sikder
Bing Bing Zhou
C Chothia
C Civera
CC Lee
CR Robinson
DB Wetlaufer
FMG Pearl
G Pollastri
G Pollastri
HC Van Leeuwen
HM Berman
J Chen
J Cheng
J Liu
J Sim
JCB Melo
JE Gewehr
JS Richardson
JSR Jang
M Dumontier
M Dumontier
M Suyama
MJ Lehtinen
N Nagarajan
OV Galzitskaya
P Baldi
P Bork
Paul D Yoo
RA George
RE Schapire
RL Marsden
RR Copley
RR Joshi
RS Gokhale
S Prompramote
S Veretnik
SF Altschul
TA Holland
Y Freund
Publication venue: BioMed Central
Publication date: 13/02/2008

Background: Protein domains present some of the most useful information that can be used to understand protein structure and function. Recent research on protein domain boundary prediction has been mainly based on widely known machine learning techniques, such as Artificial Neural Networks and Support Vector Machines. In this study, we propose a new machine learning model (IGRN) that can achieve accurate and reliable classification with significantly reduced computation. The IGRN was trained using a PSSM (Position Specific Scoring Matrix), secondary structure, solvent accessibility information and the inter-domain linker index to detect possible domain boundaries for a target sequence. Results: The proposed model achieved an average prediction accuracy of 67% on the Benchmark_2 dataset for domain boundary identification in multi-domain proteins and showed superior predictive performance and generalisation ability among the most widely used neural network models. On the CASP7 benchmark dataset, it also demonstrated performance comparable to that of existing domain boundary predictors such as DOMpro, DomPred, DomSSEA, DomCut and DomainDiscovery, with 70.10% prediction accuracy. Conclusion: On two benchmark datasets, the proposed model compares favourably with other existing machine learning based methods as well as widely known domain boundary predictors, and it excels in the identification of domain boundaries in terms of model bias, generalisation and computational requirements. © 2008 Yoo et al; licensee BioMed Central Ltd.

Crossref

Michigan Technological University

PubMed Central

Analysis and Prediction of Translation Rate Based on Sequence and Functional Features of the mRNA

Author: AR Gruber
C Chothia
C Ding
D Charif
D Greenbaum
F Gebauer
G Kudla
G Lithwick
G Pollastri
G Pollastri
Grzegorz Kudla
GV Glass
H Liljenstrom
H Peng
Hai-Peng Li
I Dubchak
JE Bergmann
JL Fauchere
JP Le Quesne
Kai-Yan Feng
KC Chou
KC Chou
KC Chou
KC Chou
L Nie
LJ Jensen
M Charton
M Ringner
MA Gilchrist
MP Washburn
NT Ingolia
O Shalem
P Carmona-Saez
P Lu
PM Sharp
Q Tian
R Brockmann
R Grantham
S Galban
S Ghaemmaghami
S Varenne
Sibao Wan
SP Gygi
SS Dwight
T Huang
T Huang
T Huang
T Kawai
T Tuller
T Tuller
Tao Huang
Xiangyin Kong
Y Osada
Yu-Dong Cai
Yufang Zheng
Zhongping Xu
Publication venue: Public Library of Science
Publication date: 06/01/2011

Protein concentrations depend not only on the mRNA level, but also on the translation rate and the degradation rate. Prediction of an mRNA's translation rate would provide valuable information for an in-depth understanding of the translation mechanism and the dynamic proteome. In this study, we developed a new computational model to predict the translation rate, characterized by (1) integrating various sequence-derived and functional features, (2) applying the maximum relevance & minimum redundancy method and incremental feature selection to select features that optimize the prediction model, and (3) being able to classify the translation rate of an mRNA as high or low. The prediction accuracies under rich and starvation conditions were 68.8% and 70.0%, respectively, as evaluated by jackknife cross-validation. The following features were found to be correlated with translation rate: codon usage frequency, some gene ontology enrichment scores, the number of RNA binding proteins known to bind the mRNA product, coding sequence length, protein abundance and 5′UTR free energy. These findings might provide useful information for understanding the mechanisms of translation and the dynamic proteome. Our translation rate prediction model might become a high-throughput tool for annotating the translation rate of mRNAs on a large scale.
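The jackknife (leave-one-out) cross-validation named in the abstract above holds out each sample in turn, trains on the rest, and reports the fraction of held-out samples classified correctly. The sketch below illustrates that loop only. The 1-nearest-neighbour classifier, the function names, and the toy feature vectors are assumptions for illustration; the abstract does not specify the classifier.

```python
def nearest_neighbour_label(train, query):
    """Return the label of the training sample closest to `query` (squared Euclidean)."""
    best = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))
    return best[1]


def jackknife_accuracy(samples):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_neighbour_label(train, features) == label:
            correct += 1
    return correct / len(samples)


# Toy feature vectors with "low"/"high" translation-rate labels (illustrative only).
toy_samples = [
    ((0.0, 0.1), "low"),
    ((0.2, 0.0), "low"),
    ((0.1, 0.3), "low"),
    ((1.0, 0.9), "high"),
    ((0.9, 1.1), "high"),
    ((0.4, 0.4), "high"),  # deliberately placed near the "low" cluster
]

accuracy = jackknife_accuracy(toy_samples)
print(f"jackknife accuracy = {accuracy:.3f}")
```

Because every sample is tested exactly once and never seen during its own training round, the estimate uses all the data without the optimism of testing on training samples.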

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Protein structure search and local structure characterization

Author: A Andreeva
AC Camproux
AG de Brevern
AG de Brevern
AG de Brevern
AR Ortiz
B Offmann
B Rost
C Benros
C Bystroff
CA Orengo
D Baker
E Appella
F Birzele
F Guyon
G Pollastri
HM Berman
IN Shindyalov
J Garnier
J Schuchhardt
J Vesanto
JA Hartigan
JM Yang
JS Fetrow
L Holm
M Carpentier
M Dudev
M Tyagi
M Tyagi
M Tyagi
NJ Mulder
O Sander
R Unger
S Henikoff
Shih-Yen Ku
T Madej
TL Bailey
TM Mitchell
TN Petersen
U Hobohm
VS Gowri
W Humphrey
WM Zheng
WR Pearson
Y Liu
Y Ye
Yuh-Jyh Hu
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Structural similarities among proteins can provide valuable insight into their functional mechanisms and relationships. As the number of available three-dimensional (3D) protein structures increases, a greater variety of studies can be conducted with increasing efficiency, among which is the design of protein structural alphabets. Structural alphabets allow us to characterize local structures of proteins and describe the global folding structure of a protein using a one-dimensional (1D) sequence. Thus, 1D sequences can be used to identify structural similarities among proteins using standard sequence alignment tools such as BLAST or FASTA. Results: We used self-organizing maps in combination with a minimum spanning tree algorithm to determine the optimum size of a structural alphabet and applied the k-means algorithm to group protein fragments into clusters. The centroids of these clusters defined the structural alphabet. We also developed a flexible matrix training system to build a substitution matrix (TRISUM-169) for our alphabet. Based on FASTA and using TRISUM-169 as the substitution matrix, we developed the SA-FAST alignment tool. We compared the performance of SA-FAST with that of various search tools in database-scale search tasks and found that SA-FAST was highly competitive in all tests conducted. Further, we evaluated the performance of our structural alphabet in recognizing specific structural domains of EGF and EGF-like proteins. Our method successfully recovered more EGF sub-domains using our structural alphabet than when using other structural alphabets. SA-FAST can be found at http://140.113.166.178/safast/. Conclusion: The goal of this project was two-fold. First, we wanted to introduce a modular design pipeline to those who have been working with structural alphabets. Second, we wanted to open the door to researchers who have done substantial work in biological sequences but have yet to enter the field of protein structure research. Our experiments showed that by transforming the structural representations from 3D to 1D, several 1D-based tools can be applied to structural analysis, including similarity searches and structural motif finding.
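The abstract above describes grouping protein fragments with k-means and using the cluster centroids as the letters of a structural alphabet, so that a structure can be rewritten as a 1D string. The sketch below illustrates that idea only and is not the SA-FAST pipeline. The 2-D fragment descriptors, the deterministic initialisation from the first k points, and all function names are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, point):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda c: squared_distance(centroids[c], point))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; centroids start at the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def encode(points, centroids, letters="ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """Map each fragment to the letter of its nearest centroid."""
    return "".join(letters[nearest(centroids, p)] for p in points)


# Toy 2-D fragment descriptors in chain order (illustrative values only).
fragments = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 4.9), (0.1, 0.2), (4.9, 5.2)]
alphabet = kmeans(fragments, k=2)
sequence_1d = encode(fragments, alphabet)
print(sequence_1d)
```

Once structures are strings over such an alphabet, sequence tools can compare them, provided a substitution matrix suited to the alphabet is available.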

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Automated Alphabet Reduction for Protein Datasets

Author: AD Solis
AD Solis
AD Solis
Alfonso Valencia
AR Kinjo
B Rost
C Etchebest
C Sander
CD Livingstone
F Melo
G Harik
G Pollastri
G Venturini
J Bacardit
J Bacardit
J Bacardit
J Bacardit
J Meiler
J Mintseris
J Wang
Jaume Bacardit
JO Wrabl
Jonathan D Hirst
JY Wang
K Yue
KA Dill
KM Misura
LR Murphy
M Cieplak
M Gribskov
M Stout
Michael Stout
MJ Wood
MS Cline
N Krasnogor
Natalio Krasnogor
O Dor
Robert E Smith
S Akanuma
S Henikoff
S Kamtekar
S Kullback
S Miyazawa
S Qin
SF Altschul
T Li
T Noguchi
TM Cover
W Kabsch
X Liu
Y Ikenaka
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: We investigate automated and generic alphabet reduction techniques for protein structure prediction datasets. Reducing alphabet cardinality without losing key biochemical information opens the door to potentially faster machine learning, data mining and optimization applications in structural bioinformatics. Furthermore, reduced but informative alphabets often result in, e.g., more compact and human-friendly classification/clustering rules. In this paper we propose a robust and sophisticated alphabet reduction protocol based on mutual information and state-of-the-art optimization techniques. Results: We applied this protocol to the prediction of two protein structural features: contact number and relative solvent accessibility. For both features we generated alphabets of two, three, four and five letters. The five-letter alphabets gave prediction accuracies statistically similar to those obtained using the full amino acid alphabet. Moreover, the automatically designed alphabets outperformed other reduced alphabets, both human-designed and taken from the literature. The differences between our alphabets and the alphabets taken from the literature were quantitatively analyzed. All of the above was performed using a primary sequence representation of proteins. As a final experiment, we extrapolated the obtained five-letter alphabet to reduce a much richer protein representation based on evolutionary information for the prediction of the same two features. Again, the performance gap between the full representation and the reduced representation was small, showing that the results of our automated alphabet reduction protocol, even though they were obtained using a simple representation, also capture the crucial information needed for state-of-the-art protein representations. Conclusion: Our automated alphabet reduction protocol generates competent reduced alphabets tailored specifically for a variety of protein datasets. This process is done without any domain knowledge, using information theory metrics instead. The reduced alphabets contain some unexpected (but sound) groups of amino acids, thus suggesting new ways of interpreting the data.
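The protocol in the abstract above scores candidate alphabets with mutual information. A minimal sketch of that scoring step follows: it measures how much information a residue letter carries about a structural class label, before and after merging residues into groups. It does not reproduce the authors' optimization procedure. The grouping, the "buried"/"exposed" labels, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(letters, labels):
    """Mutual information (in bits) between two aligned discrete sequences."""
    n = len(letters)
    joint = Counter(zip(letters, labels))
    letter_counts = Counter(letters)
    label_counts = Counter(labels)
    mi = 0.0
    for (a, y), count in joint.items():
        p_ay = count / n
        p_a = letter_counts[a] / n
        p_y = label_counts[y] / n
        mi += p_ay * math.log2(p_ay / (p_a * p_y))
    return mi


def reduce_sequence(residues, grouping):
    """Replace each residue by the letter of its group in the reduced alphabet."""
    return [grouping[r] for r in residues]


# Toy data (illustrative only): six residue types with a class label each.
residues = list("AVLKDE" * 2)
labels = ["buried", "buried", "buried", "exposed", "exposed", "exposed"] * 2

# Hypothetical two-letter alphabet: "h" and "p" are made-up group names.
grouping = {"A": "h", "V": "h", "L": "h", "K": "p", "D": "p", "E": "p"}

full_mi = mutual_information(residues, labels)
reduced_mi = mutual_information(reduce_sequence(residues, grouping), labels)
print(f"full alphabet: {full_mi:.3f} bits, reduced alphabet: {reduced_mi:.3f} bits")
```

In this toy case the two-letter alphabet keeps all of the information about the label; a search over groupings would aim to keep this score as high as possible for a given alphabet size.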

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UCL Discovery

Prodepth: Predict Residue Depth by Support Vector Regression Approach from Protein Sequences Only

Author: A Pintar
A Pintar
A Pintar
A Schlessinger
A Schlessinger
A Schlessinger
A Schlessinger
A Shrake
AG Murzin
AR Kinjo
AR Kinjo
Ashley M. Buckle
B Lee
B Rost
B Rost
B Rost
C Chothia
CK Smith
D Baker
D Varrazzo
D Xie
DT Jones
DT Jones
E Schmitt
EM Marcotte
F Ferre
G Pollastri
Geoffrey I. Webb
GP Raghava
H Chen
H Zhang
H Zhou
Hao Tan
HM Berman
J Cheng
J Cheng
J Qiu
J Song
J Song
J Song
J Song
J Wan
James C. Whisstock
JC Whisstock
Jiangning Song
JJ Ward
JM Chandonia
JU Bowie
K Bajaj
K Chen
K Vlahovicek
Khalid Mahmood
L Kurgan
LA Kurgan
M Connolly
M Kumar
M Lee
M Stout
ME Lacombe-Harvey
MK Kalita
MN Nguyen
O Schueler-Furman
P Radivojac
RG Coleman
Ruby H. P. Law
S Ahmad
S Chakravarty
S Liu
S Miller
Sean David Mooney
SF Altschul
T Hamelryck
T Ishida
T Joachims
T Noguchi
Tatsuya Akutsu
TL Blundell
V Vapnik
V Vapnik
W Kabsch
W Liu
W Zhang
WL DeLano
X Wang
Y Bromberg
Y Kalidas
Y Ofran
Y Ofran
Z Yuan
Z Yuan
ZX Wang
Publication venue: Public Library of Science
Publication date: 01/01/2009

Residue depth (RD) is a solvent exposure measure that complements the information provided by conventional accessible surface area (ASA) and describes the extent to which a residue is buried in the protein structure space. Previous studies have established that RD is correlated with several protein properties, such as protein stability, residue conservation and amino acid type. Accurate prediction of RD has many potentially important applications in the field of structural bioinformatics, for example, facilitating the identification of functionally important residues, residues in the folding nucleus, or enzyme active sites from sequence information. In this work, we introduce an efficient approach that uses support vector regression to quantify the relationship between RD and protein sequence. We systematically investigated eight different sequence encoding schemes, including both local and global sequence characteristics, and examined their respective prediction performances. For the objective evaluation of our approach, we used 5-fold cross-validation to assess the prediction accuracies and showed that the overall best performance, achieved after incorporating the relevant multiple sequence features, was a correlation coefficient (CC) of 0.71 between the observed and predicted RD values and a root mean square error (RMSE) of 1.74. The results suggest that residue depth could be reliably predicted solely from protein primary sequences: local sequence environments are the major determinants, while global sequence features influence the prediction performance only marginally. We highlight two examples as a comparison in order to illustrate the applicability of this approach. We also discuss the potential implications of this new structural parameter in the field of protein structure prediction and homology modeling. This method might prove to be a powerful tool for sequence analysis.
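The 5-fold cross-validation mentioned in the abstract above partitions the data into five folds and uses each fold once as the test set while training on the other four. The sketch below shows only the index bookkeeping. The interleaved fold assignment and the function name `k_fold_indices` are assumptions for illustration, and the abstract does not say how the folds were formed.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Sample i goes to fold i % k (an interleaved split chosen for this sketch).
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        yield train_idx, test_idx


# Ten hypothetical samples (for example, ten proteins) split into five folds.
splits = list(k_fold_indices(10, k=5))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

For residue-level predictors it is usually safer to split by protein rather than by residue, so that neighbouring residues of one chain do not appear in both the training and the test fold.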

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

Deciphering the Preference and Predicting the Viability of Circular Permutations in Proteins

Author: A Bakan
A Chakrabartty
A Guerler
A Guerler
A Jeltsch
A Kuzmanic
A Pintar
AC Wallace
AE Todd
AR Panchenko
AR van Erkel
AS Aranko
B Anand
B Halle
B Lee
BA Cunningham
BE Jones
C Pommie
C Vogel
CC Chang
CH Lu
CH Shih
CJ Crasto
CP Lin
CP Ponting
D Bordo
DA Case
Darren R. Flower
DL Nelson
DM Carrington
EA Ribeiro Jr
ESC Shih
FH Arnold
G Amitai
G Bulaj
G Pollastri
GS Baird
H Iwai
H Zhang
HK Liang
HM Berman
I Bahar
I Remy
J Chen
J Hennecke
J Weiner III
J Zhu
JD Pedelacq
Jenn-Kang Hwang
JM Bujnicki
JM Word
JM Yang
JR Quinlan
K Nishikawa
KH Paszkiewicz
L Chen
L Li
LC Tsai
LG Gebhard
Li-Fen Wang
M Elarabaty
M Iwakura
M Kojima
M Ostermeier
M Paluszewski
M Zavodszky
ML Connolly
MN Nguyen
PC Lyu
Ping-Chiang Lyu
PJ Werbos
R Garrett
R Vandrunen
RJ Moreau
S Akanuma
S Hovmoller
S Kundu
S Topell
S Uliel
S Uliel
SF Betz
SG Peisajovich
SJ Hubbard
ST Hsu
T Haliloglu
T Hesterberg
T Nakamura
T Noguchi
Tian Dai
TU Schwartz
V Anantharaman
V Muralidharan
W Kabsch
W Li
W Zheng
WC Lo
WC Lo
WC Lo
Wei-Cheng Lo
WR Pearson
Y Lindqvist
Y Yu
Y Zhang
Yen-Yi Liu
Z Qian
Publication venue: Public Library of Science
Publication date: 16/02/2012

Circular permutation (CP) refers to situations in which the termini of a protein are relocated to other positions in the structure. CP occurs naturally and has been artificially created to study protein function, stability and folding. Recently, CP has been increasingly applied to engineer enzyme structure and function, and to create bifunctional fusion proteins unachievable by tandem fusion. CP is a complicated and expensive technique. An intrinsic difficulty in its application lies in the fact that not every position in a protein is amenable to creating a viable permutant. To examine the preferences of CP and develop CP viability prediction methods, we carried out comprehensive analyses of the sequence, structural, and dynamical properties of known CP sites using a variety of statistics and simulation methods, such as bootstrap aggregating, permutation tests and molecular dynamics simulations. CP particularly favors Gly, Pro, Asp and Asn. Positions preferred by CP lie within coils, loops, turns, and at residues that are exposed to solvent, weakly hydrogen-bonded, environmentally unpacked, or flexible. Disfavored positions include Cys, bulky hydrophobic residues, and residues located within helices or near the protein's core. These results fostered the development of an effective viable CP site prediction system, which combined four machine learning methods: artificial neural networks, the support vector machine, a random forest, and a hierarchical feature integration procedure developed in this work. As assessed using the dihydrofolate reductase dataset as the independent evaluation dataset, this prediction system achieved an AUC of 0.9. Large-scale predictions have been performed for nine thousand representative protein structures; several new potential applications of CP were thus identified. Many previously unreported preferences of CP are revealed in this study. The developed system is the best CP viability prediction method currently available. This work will facilitate the application of CP in research and biotechnology.
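The abstract above reports performance as an AUC (area under the ROC curve). One standard way to compute it is as the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one, with ties counted as one half. The sketch below illustrates that definition only; the scores, the labels, and the function name are toy assumptions, not values or code from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons of positives and negatives.

    `labels` holds 1 for positive (e.g. a viable site) and 0 for negative samples.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy prediction scores and true labels (illustrative only).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

auc_value = auc(scores, labels)
print(f"AUC = {auc_value:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of positives from negatives.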

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The Francis Crick Institute

Trends in template/fragment-free protein structure prediction

Author: A BenNaim
A Cavalli
A Elofsson
A Grossfield
A Jagielska
A Liwo
A Liwo
A Pillardy
A Warshel
A Warshel
A Warshel
AE Roitberg
AF Voter
AP Lyubartsev
AR Ortiz
AR Panchenko
AV Morozov
B Fain
B Roux
B Xue
B Zagrovic
BR Brooks
C Alsenoy Van
C Bystroff
C Hardin
C Hoppe
C Simmerling
C Simmerling
C Zhang
C Zhang
C Zhang
C Zhang
C Zhang
CL Brooks
CM Deane
CM Summa
D Chivian
D Eramian
D Gilis
D Hamelberg
D Jiao
D Katagiri
D Kihara
D Kim
DE Kim
DE Shaw
DS Wishart
DT Jones
E Faraggi
E Faraggi
E Ferrada
E Ferrada
E Haber
E Krieger
E Marinari
E Pettersson
Eshel Faraggi
F Wagner
F Zhao
F Zhao
F Zhao
FG Wang
G Chopra
G Cornilescu
G Pollastri
G Yona
GA Kaminski
GA Papoian
GM Torrie
GR Bowman
H Fan
H Kamberaj
H Kamisetty
H Lu
H Zhou
H Zhou
Hongxing Lei
HP Gong
HS Kang
HX Lei
HX Lei
HX Lei
HX Lei
HX Lei
HY Liu
HY Zhou
HZ Li
J Cheng
J DeBartolo
J DeBartolo
J Lundstrom
J Meiler
J Moult
J Pei
J Shi
J Skolnick
J Vreede
J Wang
J Xu
J Zhu
J Zhu
JA Hegler
JA McCammon
JA McCammon
JA Vila
JE Stone
JF Gibrat
JL Gao
JL Knight
JM Bujnicki
JM Bujnicki
JP Ma
JP Piquemal
JW Pitera
K Karplus
KT Simons
LA Kelley
LC Song
LJ Yang
LJ Yang
LJ Yang
LQ Zheng
M Ben-David
M Challacombe
M Christen
M Lu
M Lu
M Masella
M Mirzaie
M Nanias
M Stork
M Vieth
MJ Rooman
MJ Sippl
MM Seibert
MR Betancourt
MR Lee
MS Friedrichs
MS Lin
MS Shell
MY Shen
N Todorova
N Yu
N Yu
NV Buchete
O Dor
O Dor
O Zimmermann
P Bradley
P Robustelli
P Sherwood
PA Bash
PD Renfrew
PD Thomas
PEM Lopes
PH Maccallum
PH Maccallum
PI Bakker de
R Kuang
R Paulini
R Samudrala
R Srinivasan
RW Montalvao
S Brown
S Chowdhury
S Kannan
S Liu
S Miyazawa
S Miyazawa
S Neal
S Oldziej
S Patel
S Piana
S Piana
S Roy
S Tanaka
SB Ozkan
SF Altschul
SJ Weiner
T Hamelryck
T Kortemme
T Lazaridis
T Yoshidome
TC Terwilliger
TJ Brunette
U Ryde
UHE Hansmann
V Leone
V Tozzini
V Tozzini
V Tsui
VA Eyrich
W Blokzijl
W Boomsma
W Xie
W Zhang
WS Xie
WW Chen
X Zhu
XF Li
XP Xu
Y Duan
Y Duan
Y Shan
Y Shen
Y Shen
Y Sugita
Y Zhang
Y Zhang
Y Zhou
Yaoqi Zhou
YD Yang
YD Yang
YD Yang
YG Mu
YH Tan
YH Wu
Yong Duan
YQ Gao
YQ Gao
Yuedong Yang
YX Liu
Z Wang
ZX Wang
Publication venue: Springer-Verlag
Publication date: 01/01/2010

Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance, as the gap between the number of known sequences and the number of experimentally solved structures is widening rapidly. Currently, the most successful approaches are based on fragment/template reassembly. The lack of progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions, as well as sampling techniques, for fragment-free structure prediction. Recent physical- and knowledge-based studies have demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California