32 research outputs found

    Protein Block Expert (PBE): a web-based protein structure analysis server using a structural alphabet

    Encoding protein 3D structures into 1D strings using short structural prototypes, or structural alphabets, opens a new front for structure comparison and analysis. Using the well-documented 16 motifs of Protein Blocks (PBs) as a structural alphabet, we have developed a methodology to compare protein structures encoded as sequences of PBs by aligning them with dynamic programming and a PB substitution matrix. This methodology is implemented in the applications available on the Protein Block Expert (PBE) server. PBE addresses common issues in protein structure analysis, such as the comparison of protein structures and the identification of structures in structural databanks that resemble a given structure. PBE-T provides a facility to transform any PDB file into sequences of PBs. PBE-ALIGNc compares two protein structures based on the alignment of their corresponding PB sequences. PBE-ALIGNm is a facility for mining the SCOP database for similar structures based on the alignment of PBs. In addition, PBE provides an interface to a database (PBE-SAdb) of preprocessed PB sequences from SCOP culled at 95% and of all-against-all pairwise PB alignments at family and superfamily levels. PBE server is freely available at
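The alignment step described in this abstract can be illustrated with a minimal sketch of global dynamic-programming alignment over PB sequences. It assumes the conventional labelling of the 16 PBs as letters a to p; the function names `toy_matrix` and `align_pb`, the linear gap penalty, and the match/mismatch scores are hypothetical placeholders and do not reproduce the published PB substitution matrix or the server's actual implementation.

```python
from itertools import product

PB_LETTERS = "abcdefghijklmnop"  # assumption: the 16 PBs labelled a-p


def toy_matrix(match=2, mismatch=-1):
    """Placeholder scores only; NOT the published PB substitution matrix."""
    return {(x, y): (match if x == y else mismatch)
            for x, y in product(PB_LETTERS, repeat=2)}


def align_pb(seq1, seq2, matrix, gap=-2):
    """Global (Needleman-Wunsch style) alignment of two PB sequences."""
    n, m = len(seq1), len(seq2)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    # Fill the table: substitution, gap in seq2, or gap in seq1.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + matrix[(seq1[i - 1], seq2[j - 1])],
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + matrix[(seq1[i - 1], seq2[j - 1])]):
            a1.append(seq1[i - 1])
            a2.append(seq2[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            a1.append(seq1[i - 1])
            a2.append("-")
            i -= 1
        else:
            a1.append("-")
            a2.append(seq2[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(a1)), "".join(reversed(a2))
```

A call such as `align_pb("mmmnop", "mmnop", toy_matrix())` returns the alignment score together with the two gapped strings. In practice the toy scores would be replaced by a substitution matrix derived from structural data.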

    Systematic search for putative new domain families in Mycoplasma gallisepticum genome

    Background: Protein domains are the fundamental units of protein structure, function and evolution. The delineation of different domains in proteins is important for classification and for understanding structure, function and evolution. The delineation of protein domains within a polypeptide chain, notably at the genome scale, can be achieved in several ways but may remain problematic in many instances. Difficulties in identifying the domain content of a given sequence arise when the query sequence has no homologues with experimentally determined structure and searching against sequence domain databases also yields insignificant matches. Identification of domains under conditions of low sequence identity and a lack of structural homologues acquires crucial importance, especially at the genomic scale. Findings: We have developed a new method for the identification of domains in unassigned regions through indirect connections and scaled up its application to the analysis of 434 unassigned regions in 726 protein sequences of the Mycoplasma gallisepticum genome. We could establish 71 new domain relationships and 63 probable new domain families through intermediate sequences in the unassigned regions, which importantly represent an overall 10% increase in PfamA domain annotation over the direct assignment in this genome. Conclusions: The systematic analysis of the unassigned regions in the Mycoplasma gallisepticum genome has provided some insight into the possible new domain relationships and putative new domain families. Further investigation of these predicted new domains may prove beneficial in improving the existing domain prediction algorithms.

    Protein structure search and local structure characterization

    Background: Structural similarities among proteins can provide valuable insight into their functional mechanisms and relationships. As the number of available three-dimensional (3D) protein structures increases, a greater variety of studies can be conducted with increasing efficiency, among which is the design of protein structural alphabets. Structural alphabets allow us to characterize local structures of proteins and describe the global folding structure of a protein using a one-dimensional (1D) sequence. Thus, 1D sequences can be used to identify structural similarities among proteins using standard sequence alignment tools such as BLAST or FASTA. Results: We used self-organizing maps in combination with a minimum spanning tree algorithm to determine the optimum size of a structural alphabet and applied the k-means algorithm to group protein fragments into clusters. The centroids of these clusters defined the structural alphabet. We also developed a flexible matrix training system to build a substitution matrix (TRISUM-169) for our alphabet. Based on FASTA and using TRISUM-169 as the substitution matrix, we developed the SA-FAST alignment tool. We compared the performance of SA-FAST with that of various search tools in database-scale search tasks and found that SA-FAST was highly competitive in all tests conducted. Further, we evaluated the performance of our structural alphabet in recognizing specific structural domains of EGF and EGF-like proteins. Our method successfully recovered more EGF sub-domains using our structural alphabet than when using other structural alphabets. SA-FAST can be found at http://140.113.166.178/safast/. Conclusion: The goal of this project was two-fold. First, we wanted to introduce a modular design pipeline to those who have been working with structural alphabets. Secondly, we wanted to open the door to researchers who have done substantial work in biological sequences but have yet to enter the field of protein structure research. Our experiments showed that by transforming the structural representations from 3D to 1D, several 1D-based tools can be applied to structural analysis, including similarity searches and structural motif finding.
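The idea of clustering fragments with k-means and using the centroids as alphabet letters can be sketched as follows. This is only an illustration under stated assumptions: fragments are treated as plain numeric vectors compared by Euclidean distance (the abstract does not state the descriptor), the number of clusters `k` is supplied directly rather than chosen with self-organizing maps and a minimum spanning tree, and the names `kmeans` and `encode` are hypothetical.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means; deterministic init from the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        # Assignment step: each fragment joins its nearest centroid.
        for p in points:
            idx = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[idx].append(p)
        # Update step: each centroid becomes the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def encode(fragments, centroids, letters):
    """Map each fragment to the letter of its nearest centroid (3D -> 1D string)."""
    return "".join(
        letters[min(range(len(centroids)),
                    key=lambda c: math.dist(f, centroids[c]))]
        for f in fragments
    )
```

Once a structure is encoded this way, the resulting string can be handed to ordinary sequence tools, which is the point the abstract makes about reusing 1D methods.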

    Conserved Molecular Underpinnings and Characterization of a Role for Caveolin-1 in the Tumor Microenvironment of Mature T-Cell Lymphomas

    Neoplasms of extra-thymic T-cell origin represent a rare and difficult population characterized by poor clinical outcome, aggressive presentation, and poorly defined molecular characteristics. Much work has been done to gain greater insights into distinguishing features among malignant subtypes, but there also exists a need to identify unifying characteristics to assist in rapid diagnosis and subsequent potential treatment. Herein, we investigated gene expression data of five different mature T-cell lymphoma subtypes (n = 187) and found 21 genes to be up- and down-regulated across all malignancies in comparison to healthy CD4+ and CD8+ T-cell controls (n = 52). From these results, we sought to characterize a role for caveolin-1 (CAV1), a gene with previous description in the progression of both solid and hematological tumors. Caveolin-1 was upregulated, albeit with a heterogeneous nature, across all mature T-cell lymphoma subtypes, a finding confirmed using immunohistochemical staining on an independent sampling of mature T-cell lymphoma biopsies (n = 65 cases). Further, stratifying malignant samples in accordance with high and low CAV1 expression revealed that higher expression of CAV1 in mature T-cell lymphomas is analogous with an enhanced inflammatory and invasive gene expression profile. Taken together, these results demonstrate a role for CAV1 in the tumor microenvironment of mature T-cell malignancies and point toward potential prognostic implications

    Assignment of PolyProline II Conformation and Analysis of Sequence – Structure Relationship

    BACKGROUND: Secondary structures are elements of great importance in structural biology, biochemistry and bioinformatics. They are broadly composed of two repetitive structures, namely α-helices and β-sheets, apart from turns, and the rest is associated with coil. These repetitive secondary structures have specific and conserved biophysical and geometric properties. The PolyProline II (PPII) helix is yet another interesting repetitive structure, which is less frequent and not usually associated with stabilizing interactions. Recent studies have shown that PPII frequency is higher than expected, and that PPII helices could have an important role in protein-protein interactions. METHODOLOGY/PRINCIPAL FINDINGS: A major factor that limits the study of PPII is that its assignment cannot be carried out with the most commonly used secondary structure assignment methods (SSAMs). The purpose of this work is to propose a PPII assignment methodology that can be defined in the frame of DSSP secondary structure assignment. Considering the ambiguity in PPII assignments by different methods, a consensus assignment strategy was utilized. To define the most consensual rule of PPII assignment, three SSAMs that can assign PPII were compared and analyzed. The assignment rule was defined to have a maximum coverage of all assignments made by these SSAMs. Few constraints were added to the assignment, and only PPII helices at least 2 residues long are defined. CONCLUSIONS/SIGNIFICANCE: The simple rules designed in this study for characterizing PPII conformation lead to the assignment of 5% of all amino acids as PPII. Sequence-structure relationships associated with PPII, as defined by the different SSAMs, underline a few striking differences. A specific study of amino acid preferences in their N- and C-cap regions was carried out, as well as of their solvent accessibility and contact patterns. Thus the assignment of PPII can be coupled with DSSP, which opens a simple way for further analysis in this field
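A rule of the kind described here, PPII assigned on top of an existing DSSP-style assignment with a minimum run length of 2 residues, might be sketched as below. The abstract does not give the actual angular criteria, so the reference angles (phi near -75°, psi near +145°, the commonly quoted PPII region), the `TOLERANCE` window, the use of `"C"` as the coil code, and the name `assign_ppii` are all assumptions for illustration only.

```python
PPII_PHI, PPII_PSI = -75.0, 145.0  # assumed PPII reference angles, in degrees
TOLERANCE = 29.0                   # assumed half-width of the accepted window


def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def assign_ppii(residues, min_len=2):
    """residues: list of (phi, psi, code), where code is a DSSP-like letter
    and 'C' stands for coil (a simplification). Coil residues in the PPII
    region are relabelled 'P' when they form a run of at least min_len."""
    out = [code for _, _, code in residues]
    candidate = [
        code == "C"
        and angle_diff(phi, PPII_PHI) <= TOLERANCE
        and angle_diff(psi, PPII_PSI) <= TOLERANCE
        for phi, psi, code in residues
    ]
    i = 0
    while i < len(candidate):
        if candidate[i]:
            j = i
            while j < len(candidate) and candidate[j]:
                j += 1
            if j - i >= min_len:  # keep only runs long enough to count as a helix
                out[i:j] = ["P"] * (j - i)
            i = j
        else:
            i += 1
    return "".join(out)
```

Restricting candidates to coil residues keeps helices and strands assigned by the underlying method untouched, which mirrors the idea of coupling the PPII assignment with DSSP rather than replacing it.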

    Studies of a murine monoclonal antibody directed against DARC: reappraisal of its specificity.

    Duffy Antigen Receptor for Chemokines (DARC) plays multiple roles in human health as a blood group antigen, a receptor for chemokines and the only known receptor for Plasmodium vivax merozoites. It is the target of the murine anti-Fy6 monoclonal antibody 2C3, which binds to the first extracellular domain (ECD1), but the exact nature of the recognized epitope has been the subject of contradictory reports. Here, using a set of complex experiments that include expression of DARC with amino acid substitutions within the Fy6 epitope in E. coli and K562 cells, ELISA, surface plasmon resonance (SPR) and flow cytometry, we have resolved discrepancies between previously published reports and show that the basic epitope recognized by the 2C3 antibody is 22FEDVW26, with 22F and 26W being the most important residues. In addition, we demonstrated that 30Y plays an auxiliary role in binding, particularly when the residue is sulfated. The STD-NMR studies performed using 2C3-derived Fab and synthetic peptide corroborated most of these results and, together with the molecular modelling, suggested that 25V is not involved in direct interactions with the antibody but determines folding of the epitope backbone

    Protein structure mining using a structural alphabet

    We present a comprehensive evaluation of a new structure mining method called PB-ALIGN. It is based on the encoding of protein structure as a 1D sequence of a combination of 16 short structural motifs or protein blocks (PBs). PBs are short motifs capable of representing most of the local structural features of a protein backbone. Using a derived PB substitution matrix and a simple dynamic programming algorithm, PB sequences are aligned in the same way as amino acid sequences to yield a structure alignment. Alignment of these local features as a sequence of symbols enables fast detection of structural similarities between two proteins. The ability of the method to characterize and align regions beyond regular secondary structures, for example the N and C caps of helices and the loops connecting regular structures, puts it a step ahead of existing methods, which rely strongly on secondary structure elements. PB-ALIGN achieved an efficiency of 85% in extracting the true fold from a large database of 7259 SCOP domains and succeeded in 82% of cases in identifying true super-family members. In comparison with 13 existing structure comparison/mining methods, PB-ALIGN emerged as the best on the general ability test dataset and was on par with methods like YAKUSA and CE on the nontrivial test dataset. Furthermore, the proposed method performed well when compared to flexible structure alignment methods like FATCAT and outperforms them in processing speed (less than 45 s per database scan). This work also establishes a reliable cut-off value for the demarcation of similar folds. Finally, it shows that global alignment scores of unrelated structures using PBs follow an extreme value distribution

    Assessment of the C4 phosphoenolpyruvate carboxylase gene diversity in grasses (Poaceae)
