132 research outputs found

Composition Profiler: a tool for discovery and visualization of amino acid composition differences

Author: Dunker A Keith
Lonardi Stefano
Uversky Vladimir N
Vacic Vladimir
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Composition Profiler is a web-based tool for semi-automatic discovery of enrichment or depletion of amino acids, either individually or grouped by their physico-chemical or structural properties. Results: The program takes two samples of amino acids as input: a query sample and a reference sample. The latter provides a suitable background amino acid distribution, and should be chosen according to the nature of the query sample, for example, a standard protein database (e.g. SwissProt, PDB), a representative sample of proteins from the organism under study, or a group of proteins with a contrasting functional annotation. The results of the analysis of amino acid composition differences are summarized in textual and graphical form. Conclusion: As an exploratory data mining tool, our software can be used to guide feature selection for protein function or structure predictors. For classes of proteins with significant differences in frequencies of amino acids having particular physico-chemical (e.g. hydrophobicity or charge) or structural (e.g. α helix propensity) properties, Composition Profiler can be used as a rough, light-weight visual classifier.
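The query-versus-reference comparison described in this abstract can be illustrated with a minimal sketch. It assumes a simple relative difference in amino acid frequencies, (query - reference) / reference; the abstract does not state which statistic or significance test the tool actually uses, and the function names `composition` and `relative_difference` are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences):
    """Return the fraction of each standard amino acid across a sample of sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore any character that is not a standard amino acid code.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def relative_difference(query, reference):
    """Relative enrichment (>0) or depletion (<0) of each amino acid in the query.

    Assumed illustrative statistic, not necessarily the one used by the tool.
    Amino acids absent from the reference sample are skipped to avoid dividing by zero.
    """
    q = composition(query)
    r = composition(reference)
    return {aa: (q[aa] - r[aa]) / r[aa] for aa in AMINO_ACIDS if r[aa] > 0}


# Toy example: alanine is enriched and glycine depleted relative to the reference.
diff = relative_difference(["AAAG"], ["AAGG"])
```

A real analysis would also need a significance estimate for each difference, which this sketch omits.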

Crossref

USFSP Digital Archive

IUPUIScholarWorks

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Scholar Commons - University of South Florida

The variance of identity-by-descent sharing in the Wright-Fisher model

Author: Ariel Darvasi
Bennet
Hollenbeck
Itsik Pe’er
Kong
Pier Francesco Palamara
Shai Carmi
Todd Lencz
Vladimir Vacic
Publication venue: 'Genetics Society of America'
Publication date: 12/08/2013

Widespread sharing of long, identical-by-descent (IBD) genetic segments is a hallmark of populations that have experienced recent genetic drift. Detection of these IBD segments has recently become feasible, enabling a wide range of applications from phasing and imputation to demographic inference. Here, we study the distribution of IBD sharing in the Wright-Fisher model. Specifically, using coalescent theory, we calculate the variance of the total sharing between random pairs of individuals. We then investigate the cohort-averaged sharing: the average total sharing between one individual and the rest of the cohort. We find that for large cohorts, the cohort-averaged sharing is distributed approximately normally. Surprisingly, the variance of this distribution does not vanish even for large cohorts, implying the existence of "hyper-sharing" individuals. The presence of such individuals has consequences for the design of sequencing studies, since, if they are selected for whole-genome sequencing, a larger fraction of the cohort can be subsequently imputed. We calculate the expected gain in power of imputation by IBD, and subsequently, in power to detect an association, when individuals are either randomly selected or specifically chosen to be the hyper-sharing individuals. Using our framework, we also compute the variance of an estimator of the population size that is based on the mean IBD sharing and the variance in the sharing between inbred siblings. Finally, we study IBD sharing in an admixture pulse model, and show that in the Ashkenazi Jewish population the admixture fraction is correlated with the cohort-averaged sharing.

arXiv.org e-Print Archive

Crossref

DisProt: the Database of Disordered Proteins

Author: Chen Jake
Cortese Marc S.
Dunker A. Keith
Hamilton Justin A.
LeGall Tanguy
Obradovic Zoran
Sickmeier Megan
Szabo Beata
Tantos Agnes
Tompa Peter
Uversky Vladimir N.
Vacic Vladimir
Publication venue: Oxford University Press
Publication date: 01/12/2006

The Database of Protein Disorder (DisProt) links structure and function information for intrinsically disordered proteins (IDPs). Intrinsically disordered proteins do not form a fixed three-dimensional structure under physiological conditions, either in their entireties or in segments or regions. We define IDP as a protein that contains at least one experimentally determined disordered region. Although lacking fixed structure, IDPs and regions carry out important biological functions, being typically involved in regulation, signaling and control. Such functions can involve high-specificity low-affinity interactions, the multiple binding of one protein to many partners and the multiple binding of many proteins to one partner. These three features are all enabled and enhanced by protein intrinsic disorder. One of the major hindrances in the study of IDPs has been the lack of organized information. DisProt was developed to enable IDP research by collecting and organizing knowledge regarding the experimental characterization and the functional associations of IDPs. In addition to being a unique source of biological information, DisProt opens doors for a plethora of bioinformatics studies. DisProt is openly available at

CiteSeerX

Crossref

USFSP Digital Archive

PubMed Central

Scholar Commons - University of South Florida

The unfoldomics decade: an update on intrinsically disordered proteins

Author: Dunker A. Keith
Meng Jingwei
Obradovic Zoran
Oldfield Christopher J.
Romero Pedro
Uversky Vladimir N.
Vacic Vladimir
Walton Chen Jessica
Yang Jack Y.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Background: Our first predictor of protein disorder was published just over a decade ago in the Proceedings of the IEEE International Conference on Neural Networks (Romero P, Obradovic Z, Kissinger C, Villafranca JE, Dunker AK (1997) Identifying disordered regions in proteins from amino acid sequence. Proceedings of the IEEE International Conference on Neural Networks, 1: 90–95). By now more than twenty other laboratory groups have joined the efforts to improve the prediction of protein disorder. While the various prediction methodologies used for protein intrinsic disorder resemble those methodologies used for secondary structure prediction, the two types of structures are entirely different. For example, the two structural classes have very different dynamic properties, with the irregular secondary structure class being much less mobile than the disorder class. The prediction of secondary structure has been useful. On the other hand, the prediction of intrinsic disorder has been revolutionary, leading to major modifications of the more than 100-year-old views relating protein structure and function. Experimentalists have been providing evidence over many decades that some proteins lack fixed structure or are disordered (or unfolded) under physiological conditions. In addition, experimentalists are also showing that, for many proteins, their functions depend on the unstructured rather than structured state; such results are in marked contrast to the more than hundred-year-old views such as the lock and key hypothesis. Despite extensive data on many important examples, including disease-associated proteins, the importance of disorder for protein function has been largely ignored. Indeed, to our knowledge, current biochemistry books don't present even one acknowledged example of a disorder-dependent function, even though some reports of disorder-dependent functions are more than 50 years old. The results from genome-wide predictions of intrinsic disorder and the results from other bioinformatics studies of intrinsic disorder are demanding attention for these proteins. Results: Disorder prediction has been important for showing that the relatively few experimentally characterized examples are members of a very large collection of related disordered proteins that are widespread over all three domains of life. Many significant biological functions are now known to depend directly on, or are importantly associated with, the unfolded or partially folded state. Here our goal is to review the key discoveries and to weave these discoveries together to support novel approaches for understanding sequence-function relationships. Conclusion: Intrinsically disordered protein is common across the three domains of life, but especially common among the eukaryotic proteomes. Signaling sequences and sites of posttranslational modifications are frequently, or very likely most often, located within regions of intrinsic disorder. Disorder-to-order transitions are coupled with the adoption of different structures with different partners. Also, the flexibility of intrinsic disorder helps different disordered regions to bind to a common binding site on a common partner. Such capacity for binding diversity plays important roles in both protein-protein interaction networks and likely also in gene regulation networks. Such disorder-based signaling is further modulated in multicellular eukaryotes by alternative splicing, for which such splicing events map to regions of disorder much more often than to regions of structure. Associating alternative splicing with disorder rather than structure alleviates theoretical and experimentally observed problems associated with the folding of different length, isomeric amino acid sequences. The combination of disorder and alternative splicing is proposed to provide a mechanism for easily "trying out" different signaling pathways, thereby providing the mechanism for generating signaling diversity and enabling the evolution of cell differentiation and multicellularity. Finally, several recent small molecules of interest as potential drugs have been shown to act by blocking protein-protein interactions based on intrinsic disorder of one of the partners. Study of these examples has led to a new approach for drug discovery, and bioinformatics analysis of the human proteome suggests that various disease-associated proteins are very rich in such disorder-based drug discovery targets.

USFSP Digital Archive

IUPUIScholarWorks

Springer - Publisher Connector

PubMed Central

Scholar Commons - University of South Florida

Protein interaction network of alternatively spliced isoforms from brain links genetic risk factors for autism

Author: Broly Martin
Calderwood Michael A.
Corominas Castiñeira Roser
Fan Changyu
Ghamsari Lila
Hao Tong
Hill David E.
Horvath Steve
Iakoucheva Lilia M.
Kang Shuli
Korkin Dmitry
Kuang Xingyan
Lemmens Irma
Lin Guan Ning
Malhotra Dheeraj
Michaelson Jacob J.
Rodriguez Maria
Roth Frederick P.
Salehi-Ashtiani Kourosh
Sebat Jonathan
Shen Yun
Tam Stanley
Tasan Murat
Tavernier Jan
Trigg Shelly A.
Vacic Vladimir
Vidal Marc
Yang Xinping
Yi Song
Zhao Nan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/04/2021

Increased risk for autism spectrum disorders (ASD) is attributed to hundreds of genetic loci. The convergence of ASD variants has been investigated using various approaches, including protein interactions extracted from the published literature. However, these datasets are frequently incomplete, carry biases and are limited to interactions of a single splicing isoform, which may not be expressed in the disease-relevant tissue. Here we introduce a new interactome mapping approach by experimentally identifying interactions between brain-expressed alternatively spliced variants of ASD risk factors. The Autism Spliceform Interaction Network reveals that almost half of the detected interactions and about 30% of the newly identified interacting partners represent contribution from splicing variants, emphasizing the importance of isoform networks. Isoform interactions greatly contribute to establishing direct physical connections between proteins from the de novo autism CNVs. Our findings demonstrate the critical role of spliceform networks for translating genetic knowledge into a better understanding of human diseases.

Diposit Digital de la Universitat de Barcelona

Incorporating Distant Sequence Features and Radial Basis Function Networks to Identify Ubiquitin Conjugation Sites

Author: A Catic
A Hershko
A Zanzoni
AL Chernorudskiy
B Boeckmann
C Chothia
C-J Lin
CN Pang
CT Su
CW Tung
CY Ou
D Xie
DM Shien
DT Jones
GE Crooks
GZ Zhang
HM Berman
Hsin-Yi Hung
J Peng
JL Fauchere
K Bryson
K Ron
L Hicke
LJ McGuffin
M Charton
P Radivojac
R Grantham
S Ahmad
SA Chen
SF Altschul
SF Altschul
Shu-An Chen
T Gilon
TA Tatusova
TD Schneider
TL Bailey
Tzong-Yi Lee
V Vacic
Vladimir Uversky
Y-Y Ou
Yu-Yen Ou
YY Ou
Z Hu
ZR Yang
Publication venue: Public Library of Science
Publication date: 09/03/2011

Ubiquitin (Ub) is a small protein of 76 amino acids (about 8.5 kDa). In ubiquitin conjugation, ubiquitin is mainly conjugated to lysine residues of the target protein by Ub-ligating (E3) enzymes. Three major enzymes participate in ubiquitin conjugation: E1, E2 and E3, which are responsible for activating, conjugating and ligating ubiquitin, respectively. Ubiquitin conjugation in eukaryotes is an important mechanism for the proteasome-mediated degradation of proteins and for regulating the activity of transcription factors. Motivated by the importance of ubiquitin conjugation in biological processes, this investigation develops a method, UbSite, which uses an efficient radial basis function (RBF) network to identify protein ubiquitin conjugation (ubiquitylation) sites. This work investigates not only the amino acid composition but also the structural characteristics, physicochemical properties, and evolutionary information of amino acids around ubiquitylation (Ub) sites. With reference to the pathway of ubiquitin conjugation, the substrate sites for E3 recognition, which are distant from ubiquitylation sites, are investigated. The measurement of F-score in a large window size (−20 to +20) revealed a statistically significant amino acid composition and position-specific scoring matrix (evolutionary information), which are mainly located distant from Ub sites. The distant information can be used effectively to differentiate Ub sites from non-Ub sites. As determined by five-fold cross-validation, the model that was trained using the combination of amino acid composition and evolutionary information performs best in identifying ubiquitin conjugation sites. The prediction sensitivity, specificity, and accuracy are 65.5%, 74.8%, and 74.5%, respectively. Although the amino acid sequences around the ubiquitin conjugation sites do not contain conserved motifs, the cross-validation result indicates that the integration of distant sequence features of Ub sites can improve predictive performance. Additionally, the independent test demonstrates that the proposed method can outperform other ubiquitylation prediction tools.
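The three performance figures quoted in this abstract are standard confusion-matrix ratios, sketched below. The counts in the example are hypothetical, chosen only so that the arithmetic lands on the reported percentages; they are not taken from the paper, and `classification_metrics` is an illustrative name.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: Ub sites predicted correctly / missed.
    tn/fp: non-Ub sites predicted correctly / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)               # fraction of true sites recovered
    specificity = tn / (tn + fp)               # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all calls that are correct
    return sensitivity, specificity, accuracy


# Hypothetical counts (NOT from the paper): 200 positives and 6000 negatives.
sens, spec, acc = classification_metrics(tp=131, fn=69, tn=4488, fp=1512)
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, an accuracy this close to the specificity is what one would expect when negative sites greatly outnumber positive ones.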

Public Library of Science (PLOS)

Crossref

PubMed Central

Contribution of proline to the pre-structuring tendency of transient helical secondary structure elements in intrinsically disordered proteins

Author: Ahmed
Aurora
Avalos
Bin Xue
Bochkareva
Boehr
Boettcher
Boettcher
Brandl
Case
Changeux
Chewook Lee
Chi
Chi
Chuikov
Csizmok
Dancheck
Darden
Delano
Di Lello
Dyson
Fletcher
Gary W. Daughdrill
Gast
Guex
Hammes
Hemmings
Hornak
Humphrey
Jonker
Kabsch
Kim
Kim
Kini
Kussie
Kyou-Hoon Han
Lajos Kalmar
Lee
Lee
Lowe
Marsh
Mohan
Morgan
Mujtaba
Oldfield
Peter Tompa
Polverini
Radhakrishnan
Radhakrishnan
Richardson
Ryckaert
Sankararamakrishnan
Sankararamakrishnan
Sankararamakrishnan
Shangary
Theillet
Tompa
Tompa
Tsai
Uversky
Uversky
Uversky
Uversky
Vacic
Vacic
Viguera
Vladimir N. Uversky
Von Heijne
Woolfson
Wright
Yun
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

Background: IDPs function without relying on three-dimensional structures. No clear rationale for such a behavior is available yet. PreSMos are transient secondary structures observed in the target-free IDPs and serve as the target-binding active motifs in IDPs. Prolines are frequently found in the flanking regions of PreSMos. Contribution of prolines to the conformational stability of the helical PreSMos in IDPs is investigated. Methods: MD simulations are performed for several IDP segments containing a helical PreSMo and the flanking prolines. To measure the influence of flanking prolines on the structural content of a helical PreSMo, calculations were done for wild type as well as for mutant segments with Pro→Asp, His, Lys, or Ala. The change in the helicity due to removal of a proline was measured both for the PreSMo region and for the flanking regions. Results: The α-helical content in ~70% of the helical PreSMos at the early stage of simulation decreases due to replacement of an N-terminal flanking proline by other residues, whereas the helix content in nearly all PreSMos increases when the same replacements occur at the C-terminal flanking region. The helix destabilizing/terminating role of the C-terminal flanking prolines is more pronounced than the helix promoting effect of the N-terminal flanking prolines. General significance: This work represents a novel example demonstrating that a proline is encoded in an IDP with a defined purpose. The helical PreSMos presage their target-bound conformations. As they most likely mediate IDP-target binding via conformational selection, their helical content can be an important feature for IDP function. Keywords: Flanking proline; Intrinsically disordered protein (IDP); Molecular dynamics simulation; PreSMo (Pre-Structured Motif).

Crossref

USFSP Digital Archive

Scholar Commons - University of South Florida

Rosetta FlexPepDock ab-initio: Simultaneous Folding, Docking and Refinement of Peptides onto Their Receptors

Author: A Stein
B Kuhlman
B Raveh
Barak Raveh
BR Chapados
C Hetenyi
C Katz
C Wang
CA Rohl
D Frishman
DM Fowler
DT Jones
E Petsalaki
E Petsalaki
G Moncalian
HM Berman
I Antes
I Buch
J Audie
J Guhaniyogi
JG Mandell
JJ Gray
K Abe
K Gehmlich
KL Morrison
L Parthasarathi
Lior Zimmerman
M Belitsky
M Burnier
M Hashemzadeh
M Rubinstein
MY Niv
N London
N London
Nir London
Ora Schueler-Furman
P Molek
P Vanhee
P Vanhee
P Vanhee
P Vlieghe
PA Prasad
PE Wright
R Brenke
R Das
RC Ladner
RL Dunbrack Jr
S Dutta
SA Gai
SS Sidhu
SW Crawley
T Kondo
T Pawson
U Zachariae
V Neduva
V Vacic
Vladimir N. Uversky
Y Li
YJ Im
Z Li
Publication venue: Public Library of Science
Publication date: 01/01/2011

Flexible peptides that fold upon binding to another protein molecule mediate a large number of regulatory interactions in the living cell and may provide highly specific recognition modules. We present Rosetta FlexPepDock ab-initio, a protocol for simultaneous docking and de-novo folding of peptides, starting from an approximate specification of the peptide binding site. Using the Rosetta fragments library and a coarse-grained structural representation of the peptide and the receptor, FlexPepDock ab-initio samples efficiently and simultaneously the space of possible peptide backbone conformations and rigid-body orientations over the receptor surface of a given binding site. The subsequent all-atom refinement of the coarse-grained models includes full side-chain modeling of both the receptor and the peptide, resulting in high-resolution models in which key side-chain interactions are recapitulated. The protocol was applied to a benchmark in which peptides were modeled over receptors in either their bound backbone conformations or in their free, unbound form. Near-native peptide conformations were identified in 18/26 of the bound cases and 7/14 of the unbound cases. The protocol performs well on peptides from various classes of secondary structures, including coiled peptides with unusual turns and kinks. The results presented here significantly extend the scope of state-of-the-art methods for high-resolution peptide modeling, which can now be applied to a wide variety of peptide-protein interactions where no prior information about the peptide backbone conformation is available, enabling detailed structure-based studies and manipulation of those interactions

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Microduplications of 16p11.2 are associated with schizophrenia

Recurrent microdeletions and microduplications of a 600 kb genomic region of chromosome 16p11.2 have been implicated in childhood-onset developmental disorders [1-3]. Here we report the strong association of 16p11.2 microduplications with schizophrenia in two large cohorts. In the primary sample, the microduplication was detected in 12/1906 (0.63%) cases and 1/3971 (0.03%) controls (P = 1.2×10⁻⁵, OR = 25.8). In the replication sample, the microduplication was detected in 9/2645 (0.34%) cases and 1/2420 (0.04%) controls (P = 0.022, OR = 8.3). For the series combined, microduplication of 16p11.2 was associated with 14.5-fold increased risk of schizophrenia (95% C.I. [3.3, 62]). A meta-analysis of multiple psychiatric disorders showed a significant association of the microduplication with schizophrenia, bipolar disorder and autism. The reciprocal microdeletion was associated only with autism and developmental disorders. Analysis of patient clinical data showed that head circumference was significantly larger in patients with the microdeletion compared with patients with the microduplication (P = 0.0007). Our results suggest that the microduplication of 16p11.2 confers substantial risk for schizophrenia and other psychiatric disorders, whereas the reciprocal microdeletion is associated with contrasting clinical features.
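The odds ratios quoted in this abstract can be checked against the carrier counts with a plain cross-product calculation, sketched below. This is an illustration only: the abstract does not say which estimator the authors used, and the function name `odds_ratio` is hypothetical. The plain cross-product gives about 8.3 for the replication sample, matching the reported value, and about 25.2 for the primary sample, slightly below the reported 25.8, which may reflect a different estimation method.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Unadjusted sample odds ratio from a 2x2 table of carriers vs. non-carriers."""
    case_non = case_total - case_carriers
    control_non = control_total - control_carriers
    # Cross-product ratio: (a * d) / (b * c)
    return (case_carriers * control_non) / (case_non * control_carriers)


primary = odds_ratio(12, 1906, 1, 3971)      # counts quoted in the abstract
replication = odds_ratio(9, 2645, 1, 2420)   # counts quoted in the abstract

print(round(primary, 1))      # 25.2
print(round(replication, 1))  # 8.3
```

With only one carrier among the controls in each sample, the estimate is very sensitive to that single count, which is consistent with the wide confidence interval reported for the combined series.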

Carolina Digital Repository

Small RNAs and the regulation of cis-natural antisense transcripts in Arabidopsis

Background: In spite of large intergenic spaces in plant and animal genomes, 7% to 30% of genes in the genomes encode overlapping cis-natural antisense transcripts (cis-NATs). The widespread occurrence of cis-NATs suggests an evolutionary advantage for this type of genomic arrangement. Experimental evidence for the regulation of two cis-NAT gene pairs by natural antisense transcripts-generated small interfering RNAs (nat-siRNAs) via the RNA interference (RNAi) pathway has been reported in Arabidopsis. However, the extent of siRNA-mediated regulation of cis-NAT genes is still unclear in any genome. Results: The hallmarks of RNAi regulation of NATs are 1) inverse regulation of two genes in a cis-NAT pair by environmental and developmental cues and 2) generation of siRNAs by cis-NAT genes. We examined Arabidopsis transcript profiling data from public microarray databases to identify cis-NAT pairs whose sense and antisense transcripts show opposite expression changes. A subset of the cis-NAT genes displayed negatively correlated expression profiles as well as inverse differential expression changes under at least one of the examined developmental stages or treatment conditions. By searching the Arabidopsis Small RNA Project (ASRP) and Massively Parallel Signature Sequencing (MPSS) small RNA databases as well as our stress-treated small RNA dataset, we found small RNAs that matched at least one gene in 646 pairs out of 1008 (64%) protein-coding cis-NAT pairs, which suggests that siRNAs may regulate the expression of many cis-NAT genes. 209 putative siRNAs have the potential to target more than one gene and half of these small RNAs could target multiple members of a gene family. Furthermore, the majority of the putative siRNAs within the overlapping regions tend to target only one transcript of a given NAT pair, which is consistent with our previous finding on salt- and bacteria-induced nat-siRNAs. In addition, we found that genes encoding plastid- or mitochondrion-targeted proteins are over-represented in the Arabidopsis cis-NATs and that 19% of sense and antisense partner genes of cis-NATs share at least one common Gene Ontology term, which suggests that they encode proteins with possible functional connection. Conclusion: The negatively correlated expression patterns of sense and antisense genes as well as the presence of siRNAs in many of the cis-NATs suggest that siRNA regulation of cis-NATs via the RNAi pathway is an important gene regulatory mechanism for at least a subgroup of cis-NATs in Arabidopsis.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California