
9 research outputs found

Identification of Age-Related Macular Degeneration Related Genes by Applying Shortest Path Algorithm in Protein-Protein Interaction Network

Publication venue: 'Hindawi Limited'
Publication date: 01/01/2013

Crossref

Enhancing HMM-based protein profile-profile alignment with structural features and evolutionary coupling information

Author: A Biegert
A Hildebrand
A Kryshtafovych
A Zemla
CL Tang
DS Marks
E Faraggi
I Holmes
J Cheng
J Söding
J Xu
JD Thompson
Jianlin Cheng
K Ginalski
K Tomii
LN Kinch
M Henn-Sax
M Remmert
N Eswar
P Bork
R Hughey
R Mott
RD Finn
SF Altschul
TA Hopf
W Kabsch
W Zhang
X Deng
Xin Deng
Y Zhang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Identification and localization of Tospovirus genus-wide conserved residues in 3D models of the nucleocapsid and the silencing suppressor proteins

Author: Adhikari Badri
Cheng Jianlin
Olaya Cristian
Pappu Hanu R.
Raikhy Gaurav
Publication venue: IRL @ UMSL
Publication date: 01/01/2019

Background: Tospoviruses (genus Tospovirus, family Peribunyaviridae, order Bunyavirales) cause significant losses to a wide range of agronomic and horticultural crops worldwide. Identification and characterization of specific sequences and motifs that are critical for virus infection and pathogenicity could provide useful insights and targets for engineering virus resistance that is potentially both broad spectrum and durable. Tomato spotted wilt virus (TSWV), the most prolific member of the group, was used to better understand the structure-function relationships of the nucleocapsid gene (N) and the silencing suppressor gene (NSs), coded by the TSWV small RNA. Methods: Using a global collection of orthotospoviral sequences, we determined several amino acids that were conserved across the genus and the potential location of these conserved amino acid motifs in the two proteins. We used state-of-the-art 3D modeling algorithms, MULTICOM-CLUSTER, MULTICOM-CONSTRUCT, MULTICOM-NOVEL, I-TASSER, ROSETTA and CONFOLD, to predict the secondary and tertiary structures of the N and NSs proteins. Results: We identified nine amino acid residues in the N protein among 31 known tospoviral species, and ten amino acid residues in the NSs protein among 27 tospoviral species, that were conserved across the genus. For the N protein, all three algorithms gave nearly identical tertiary models. While the conserved residues were distributed throughout the protein on a linear scale, at the tertiary level three residues were consistently located in the coil in all the models. For the NSs protein models, there was no agreement among the three algorithms. However, with respect to the localization of the conserved motifs, G was consistently located in the coil, while H was localized in the coil in three models. Conclusions: This is the first report of predicting the 3D structure of any tospoviral NSs protein, and it revealed a consistent location for two of the ten conserved residues. The modelers used gave accurate predictions for the N protein, allowing the localization of the conserved residues. The results form the basis for further work on the structure-function relationships of tospoviral proteins and could be useful in developing novel virus control strategies targeting the conserved residues.

Directory of Open Access Journals

University of Missouri, St. Louis

FigShare

Combining Cryo-EM Density Map and Residue Contact for Protein Secondary Structure Topologies

Author: Alshammari Maytha
He Jing
Publication venue: ODU Digital Commons
Publication date: 01/01/2021

Although atomic structures have been determined directly from high-resolution cryo-EM density maps, current structure determination methods for medium-resolution (5 to 10 Å) cryo-EM maps are limited by the availability of structure templates. Secondary structure traces are lines detected from a cryo-EM density map for the α-helices and β-strands of a protein. A topology of secondary structures defines the mapping between a set of sequence segments and a set of traces of secondary structures in three-dimensional space. To improve accuracy in ranking secondary structure topologies, we explored a method that combines three sources of information: a set of sequence segments in 1D, a set of amino acid contact pairs in 2D, and a set of traces in 3D at the secondary structure level. A test of fourteen cases shows that the accuracy of predicted secondary structures is critical for deriving topologies. The use of significant long-range contact pairs is most effective at enriching the rank of the maximum-match topology for proteins with a large number of secondary structures, provided the secondary structure prediction is fairly accurate. The enrichment was observed to depend on the quality of the initial topology candidates in this approach. We provide detailed analysis of various cases to show the potential and the challenges of combining the three sources of information.

Directory of Open Access Journals

Old Dominion University

Improved computational methods of protein sequence alignment, model selection and tertiary structure prediction

Author: Deng Xin
Publication venue: 'University of Missouri Libraries'

Protein sequence and profile alignment is essential to most bioinformatics tasks, such as protein structure modeling, function prediction, and phylogenetic analysis. We designed a new algorithm, MSACompro, to incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into multiple protein sequence alignment. Our experiments showed that it improved multiple sequence alignment accuracy over most existing methods that do not use structural information, and that it performed comparably to, with slightly lower scores than, the method that uses structural features and additional homologous sequences. We also developed HHpacom, a new profile-profile pairwise alignment method that integrates secondary structure, solvent accessibility, torsion angle and inferred residue pair coupling information. The evaluation showed that the secondary structure, relative solvent accessibility and torsion angle information significantly improved the alignment accuracy in comparison with the state-of-the-art methods HHsearch and HHsuite. The evolutionary constraint information did help in some cases, especially in alignments of short proteins, typically 100 to 500 residues. Protein model selection is also a key step in protein tertiary structure prediction. We developed two SVM model quality assessment methods that take the query-template alignment as input. The assessment results illustrated that this could help improve model selection, protein structure prediction and many other bioinformatics problems. Moreover, we developed a protein tertiary structure prediction pipeline, many components of which were built into our group's MULTICOM system. MULTICOM performed well in the CASP10 (Critical Assessment of Techniques for Protein Structure Prediction) competition.

University of Missouri: MOspace

Mutational analysis of Kabuki Syndrome patients and functional dissection of KMT2D mutations

Author: COCCIADIFERRO DARIO
Publication venue: Università di Foggia
Publication date: 01/01/2018

The discovery of genetic alterations in the histone methyltransferase KMT2D and the demethylase KDM6A in Kabuki Syndrome (KS) expanded and highlighted the role of histone modifiers in causing congenital anomalies and intellectual disability syndromes. KS is a rare autosomal dominant condition characterized by facial features, various organ malformations, postnatal growth deficiency, and intellectual disability. Since 2011 we have performed mutational screening of our KS cohort, which now includes 505 KS patients, by Sanger sequencing and MLPA of KMT2D, followed by KDM6A analysis in those patients who tested KMT2D-negative. Among these 505 patients, we identified 196/505 (39%) patients with KMT2D variants and 208 different KMT2D variants, of which 37/208 (18%) had never been described before. The majority of KS patients carry nonsense and splice-site variants, suggesting loss of function, and therefore haploinsufficiency, as the likely mechanism for the KS phenotype. RT-PCR and direct sequencing of cDNA from Kabuki patients carrying KMT2D splice-site variants demonstrated that these cause aberrant splicing of the corresponding transcript, resulting in a truncated and non-functional translated protein. Molecular assays also showed that KMT2D mRNAs bearing a premature stop codon are degraded by nonsense-mediated mRNA decay, contributing to KMT2D protein haploinsufficiency. We hypothesized that KS patients may benefit from a readthrough therapy that mediates translational suppression of nonsense variants, restoring physiological levels of endogenous KMT2D protein. Fourteen KMT2D nonsense variants were tested for their response to readthrough treatment using an in vitro dual reporter luciferase vector system, identifying 11/14 variants that displayed high levels of readthrough in response to gentamicin treatment. Within our cohort we identified three new cases with mosaic variants in the KMT2D gene, consisting of single-nucleotide changes resulting in two already reported nonsense variants, c.13450C=/>T (p.R4484X) and c.15061C=/>T (p.R5021X), and a new frameshift variant, c.3596_3597=/del (p.L1199HfsX7), respectively. Moreover, for diagnostic and counselling purposes, we applied a number of bioinformatics tools to assess the pathogenicity of 69 KMT2D missense variants found overall in our cohort of 505 KS patients, and for 14 of them we adopted a combination of biochemical and cellular approaches to investigate their role and characterize their functional impact in the pathogenesis of the disease. We found that 9/14 missense variants showed altered H3K4 methylation activity. We additionally assessed the impact on complex formation with the WRAD protein complex, and we found that the reduced methyltransferase activity could be a consequence of a lack of interaction.

Archivio Istituzionale della Ricerca- Università degli Studi di Foggia

Methoden zur Vorhersage von komplexen biomolekularen Strukturen (Methods for the Prediction of Complex Biomolecular Structures)

Author: Wilms Christoph
Publication date: 10/02/2014

The first high-resolution structure of a protein was solved in 1985 by John Kendrew and Max Perutz. Since then, experimental structure determination has been an important part of biological research. However, determining the structures of biomolecular complexes is very difficult. These structures are nevertheless immensely important for understanding many biological phenomena at the molecular level. For this reason, a research field has emerged that uses computer-aided modeling to predict biomolecular structures. The aim of this doctoral thesis was to develop methods for predicting complex biomolecular structures. These methods are based on three different approaches. The first method was developed for proteins that consist of several domains. It uses existing structures of the individual domains together with experimental data that capture geometric relations between the domains, and it enables the study of conformational changes caused by external influences, such as the addition of a substrate. As a case study, the conformation of the flexible two-domain protein peptidylprolyl cis/trans isomerase NIMA-interacting 1 (Pin1) was investigated, along with its change in response to the addition of the substrate polyethylene glycol (PEG). The second method is based on the new technique Direct Coupling Analysis (DCA), which makes it possible to predict geometric contacts between amino acids from a multiple sequence alignment (MSA). DCA uses a correction to avoid the sampling bias caused by the selection of sequences for the MSA. The optimization presented here enables a more robust prediction of the geometric contacts. The optimized method was applied to the analysis of the Human Immunodeficiency Virus-1 envelope protein (HIV-1 Env). The last method was developed to predict binding regions of the negatively charged heparan sulfate on proteins. For this purpose, we developed a model based on electrostatic interaction. The case studies here are various heparan sulfate-binding proteins, such as the chemokine CCL3 and the Hedgehog proteins. Overall, it is shown that, for different kinds of biomolecular structures and complexes, modern computational methods provide insights that are consistent with experiments.
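The sampling-bias correction for the MSA mentioned in this abstract is not specified in the record. As an illustration only, the sketch below shows sequence reweighting, a scheme commonly used with DCA in which each sequence is down-weighted by the number of near-identical sequences in the alignment. The function name `sequence_weights`, the 0.8 identity threshold, and the toy alignment are assumptions made for this example and are not taken from the thesis.

```python
def sequence_weights(msa, identity_threshold=0.8):
    """Return one weight per aligned sequence.

    Each weight is 1 / (number of sequences, including the sequence itself,
    whose fractional identity to it is at least identity_threshold).
    The threshold value is an assumption for illustration.
    """
    def identity(a, b):
        # Fraction of aligned positions carrying the same symbol.
        return sum(x == y for x, y in zip(a, b)) / len(a)

    weights = []
    for s in msa:
        neighbours = sum(1 for t in msa if identity(s, t) >= identity_threshold)
        weights.append(1.0 / neighbours)
    return weights


# Toy alignment (hypothetical): three near-identical sequences and one outlier.
msa = ["ACDEF", "ACDEF", "ACDEY", "WCHKF"]
weights = sequence_weights(msa)
effective_count = sum(weights)  # effective number of independent sequences
print(weights, effective_count)
```

In this toy alignment the three similar sequences share one unit of weight, so the effective sequence count is about 2 rather than 4. Whether the thesis uses this scheme or a different correction cannot be determined from the abstract.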

Duisburg-Essen Publications Online

Machine Learning based Protein Sequence to (un)Structure Mapping and Interaction Prediction

Author: Iqbal Sumaiya
Publication venue: ScholarWorks@UNO
Publication date: 09/08/2017

Proteins are the fundamental macromolecules within a cell that carry out most biological functions. The computational study of protein structure and function, using machine learning and data analytics, is essential to advancing life-science research, given the fast-growing volume of biological data and the extensive complexity involved in analyzing it to discover meaningful insights. We do not limit the mapping of a protein’s primary sequence to its structure; we extend it to the disordered component, known as Intrinsically Disordered Proteins or Regions (IDPs/IDRs), and hence to the dynamics involved, which help us explain complex interactions within a cell that are otherwise obscured. The objective of this dissertation is to develop effective machine learning based tools to predict disordered proteins, their properties and dynamics, and their interaction paradigm by systematically mining and analyzing large-scale biological data. In this dissertation, we propose a robust framework to predict disordered proteins given only sequence information, using an optimized SVM with an RBF kernel. Through appropriate reasoning, we highlight the structure-like behavior of IDPs in disease-associated complexes. Further, we develop a fast and effective predictor of the Accessible Surface Area (ASA) of protein residues, a useful structural property that defines a protein’s exposure to partners, using regularized regression with a 3rd-degree polynomial kernel function and a genetic algorithm. As a key outcome of this research, we then introduce a novel method to extract the position specific energy (PSEE) of protein residues by modeling the pairwise thermodynamic interactions and the hydrophobic effect. PSEE is found to be an effective feature in identifying the enthalpy gain of the folded state of a protein and, otherwise, the neutral state of unstructured proteins. Moreover, we study the transient peptide-protein interactions that involve the induced folding of short peptides, through disorder-to-order conformational changes, to bind to an appropriate partner. Using the stacked generalization ensemble technique, a suite of predictors is developed to identify, from protein sequence, the residue patterns of Peptide-Recognition Domains that can recognize and bind to peptide motifs and phospho-peptides with post-translational modifications (PTMs) of amino acids, which are responsible for critical human diseases. The biologically relevant case studies involved demonstrate the possibility of discovering new knowledge using the developed tools.
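The abstract names an SVM with an RBF kernel but gives no formula or parameters. As a minimal sketch of the standard definition, the radial basis function kernel is k(x, y) = exp(-gamma * ||x - y||^2). The function name `rbf_kernel`, the default `gamma`, and the example vectors below are assumptions for illustration and do not reflect the dissertation's features or tuned values.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Standard RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance).

    gamma is a free hyperparameter; 0.5 is an arbitrary illustrative default.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical feature vectors: identical points give similarity 1,
# and similarity decays toward 0 as the points move apart.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
near = rbf_kernel([0.0, 0.0], [1.0, 1.0])
far = rbf_kernel([0.0, 0.0], [5.0, 5.0])
print(same, near, far)
```

A larger `gamma` makes the similarity fall off faster with distance, which is why it is usually tuned together with the SVM regularization parameter.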

University of New Orleans

The MULTICOM toolbox for protein structure prediction

Author: A Fiser
A Fuchs
A Kryshtafovych
A Porollo
A Roy
A Vullo
A Zemla
A Šali
AK Dunker
AN Tegge
B Monastyrskyy
B Petersen
B Rost
B Wallner
BD O’Connor
BG Fox
C Cole
CL Worth
D Baker
D Baú
D Cozzetto
D Eisenberg
D Frishman
D Gilis
D Xu
DB Roche
E Capriotti
E Faraggi
F Ferre
FC Bernstein
G Karypis
G Lin
G Pollastri
G Shackelford
H Berman
H Zhou
I Ezkurdia
J Cheng
J Dai
J Eickholt
J Kendrew
J Liu
J Moult
J Peng
J Sim
J Soding
J Xu
JD Thompson
JE Gewehr
Jesse Eickholt
Jianlin Cheng
Jilong Li
JJ Ward
JL MacCallum
JMG Izarzugaza
K Karplus
K Shimizu
K Simons
L Kinch
L McGuffin
LM Iakoucheva
M Paluszewski
M Perutz
M Punta
M Wagner
MJ Mizianty
O Zimmermann
P Baldi
P Björkholm
P Bradley
P Chen
P Fariselli
P Larsson
R Adamczak
RL Marsden
S Hirose
S Singh
S Wu
T Ishida
T Zhang
TZ Sen
V Mariani
V Parthiban
W Kabsch
X Deng
Xin Deng
Y Li
Y Yang
Y Zhang
Z Dosztányi
Z Wang
Zheng Wang
Publication venue: BMC
Publication date: 01/04/2012

Background: As genome sequencing is becoming routine in biomedical research, the total number of protein sequences is increasing exponentially, recently reaching over 108 million. However, only a tiny portion of these proteins (i.e. ~75,000 or < 0.07%) have solved tertiary structures determined by experimental techniques. The gap between protein sequence and structure continues to enlarge rapidly as the throughput of genome sequencing techniques is much higher than that of protein structure determination techniques. Computational software tools for predicting protein structure and structural features from protein sequences are crucial to make use of this vast repository of protein resources. Results: To meet the need, we have developed a comprehensive MULTICOM toolbox consisting of a set of protein structure and structural feature prediction tools. These tools include secondary structure prediction, solvent accessibility prediction, disorder region prediction, domain boundary prediction, contact map prediction, disulfide bond prediction, beta-sheet topology prediction, fold recognition, multiple template combination and alignment, template-based tertiary structure modeling, protein model quality assessment, and mutation stability prediction. Conclusions: These tools have been rigorously tested by many users in the last several years and/or during the last three rounds of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7-9) from 2006 to 2010, achieving state-of-the-art or near performance. In order to facilitate bioinformatics research and technological development in the field, we have made the MULTICOM toolbox freely available as web services and/or software packages for academic use and scientific research. It is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.
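The sequence-structure gap quoted in this abstract can be checked with a line of arithmetic. The sketch below uses only the two figures stated in the abstract (about 108 million sequences and about 75,000 solved structures); the variable names are chosen for this example.

```python
# Figures as stated in the abstract (both approximate).
total_sequences = 108_000_000
solved_structures = 75_000

# Share of sequences with an experimentally solved tertiary structure, in percent.
fraction_percent = 100 * solved_structures / total_sequences
print(f"{fraction_percent:.4f}%")  # prints 0.0694%
```

The result, roughly 0.069%, is consistent with the "< 0.07%" stated in the abstract.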

Crossref

Directory of Open Access Journals

PubMed Central

University of Miami: Scholarship Miami