
188 research outputs found

AIDA: ab initio domain assembly server.

Author: Godzik Adam
Jaroszewski Lukasz
Li Zhanwen
Xu Dong
Publication venue: eScholarship, University of California
Publication date: 01/01/2014

AIDA: ab initio domain assembly server, available at http://ffas.burnham.org/AIDA/, is a tool that identifies domains in multi-domain proteins and then predicts their 3D structures and relative spatial arrangements. The server is free and open to all users, and users may optionally provide an e-mail address to receive a link to the results page. Domains are evolutionarily conserved and often functionally independent units in proteins. Most proteins, especially eukaryotic ones, consist of multiple domains, whereas most experimentally determined protein structures contain only one or two domains. As a result, the structures of individual domains in multi-domain proteins can often be predicted accurately, but the mutual arrangement of the domains remains unknown. To address this issue we have developed the AIDA program, which combines the steps of identifying individual domains, predicting their structures separately and assembling them into multiple-domain complexes using an ab initio folding potential to describe domain-domain interactions. The AIDA server not only supports the assembly of a large number of continuous domains, but also allows the assembly of domains inserted into other domains. Users can also provide distance restraints to guide the AIDA energy minimization.

CiteSeerX

PubMed Central

eScholarship - University of California

PubServer: literature searches by homology.

Author: Godzik Adam
Jaroszewski Lukasz
Koska Laszlo
Sedova Mayya
Publication venue: eScholarship, University of California
Publication date: 01/01/2014

PubServer, available at http://pubserver.burnham.org/, is a tool to automatically collect, filter and analyze publications associated with groups of homologous proteins. Protein entries in databases such as the Entrez Protein database at NCBI contain information about publications associated with a given protein. The scope of these publications varies widely: they include studies focused on the biochemical functions of individual proteins, but also reports from genome sequencing projects that introduce tens of thousands of proteins. Collecting and analyzing publications related to sets of homologous proteins helps in the functional annotation of novel protein families and in improving annotations of well-studied protein families or individual genes. However, performing such collection and analysis manually is a tedious and time-consuming process. PubServer automatically collects identifiers of homologous proteins using PSI-Blast, retrieves literature references from the corresponding database entries and filters out publications unlikely to contain useful information about individual proteins. It also prepares simple vocabulary statistics from titles, abstracts and MeSH terms to identify the most frequently occurring keywords, which may help to quickly identify common themes in these publications. The filtering criteria applied to collected publications are user-adjustable. The results of the server are presented as an interactive page that allows re-filtering and different presentations of the output.
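The "simple vocabulary statistics" mentioned in this abstract can be illustrated with a minimal sketch. This is not PubServer's implementation: the stopword list, the tokenizing pattern and the function name `keyword_counts` are all assumptions made for illustration, and the titles are invented examples. The sketch counts in how many texts each term appears and returns the most frequent ones.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration only; not PubServer's actual list.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "from", "by", "with", "on"}


def keyword_counts(texts, top_n=5):
    """Return the top_n (term, number_of_texts_containing_term) pairs."""
    counts = Counter()
    for text in texts:
        # Lowercase, keep tokens of two or more characters, count each term once per text.
        words = set(re.findall(r"[a-z][a-z0-9-]+", text.lower()))
        counts.update(words - STOPWORDS)
    # Sort by descending count, then alphabetically, so the output is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Invented example titles, not taken from any real search result.
titles = [
    "Crystal structure of a bacterial kinase",
    "Kinase signalling in bacterial cell division",
    "Genome sequence of a soil bacterium",
]
print(keyword_counts(titles, top_n=2))
```

Counting each term once per text, rather than once per occurrence, keeps a single long abstract from dominating the ranking; that choice is also an assumption of this sketch.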

CiteSeerX

PubMed Central

eScholarship - University of California

FFAS server: novel features and applications.

Author: Cai Xiao-hui
Godzik Adam
Jaroszewski Lukasz
Li Zhanwen
Weber Christoph
Publication venue: eScholarship, University of California
Publication date: 27/06/2011

The Fold and Function Assignment System (FFAS) server [Jaroszewski et al. (2005) FFAS03: a server for profile-profile sequence alignments. Nucleic Acids Research, 33, W284-W288] implements the algorithm for protein profile-profile alignment introduced originally in [Rychlewski et al. (2000) Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Science: a Publication of the Protein Society, 9, 232-241]. Here, we present updates, changes and novel functionality added to the server since 2005 and discuss its new applications. The sequence database used to calculate sequence profiles was enriched by adding sets of publicly available metagenomic sequences. The profile of a user's protein can now be compared with ∼20 additional profile databases, including several complete proteomes, human proteins involved in genetic diseases and a database of microbial virulence factors. A newly developed interface uses a system of tabs, allowing the user to navigate multiple results pages, and also includes novel functionality, such as a dotplot graph viewer, modeling tools, an improved 3D alignment viewer and links to the database of structural similarities. The FFAS server was also optimized for speed: running times were reduced by an order of magnitude. The FFAS server, http://ffas.godziklab.org, has no log-in requirement, albeit there is an option to register and store results in individual, password-protected directories. Source code and Linux executables for the FFAS program are available for download from the FFAS server

PubMed Central

eScholarship - University of California

The JCSG MR pipeline: optimized alignments, multiple models and parallel searches

Author: Godzik Adam
Jaroszewski Lukasz
Schwarzenbacher Robert
Publication venue: International Union of Crystallography
Publication date: 01/01/2008

The practical limits of molecular replacement can be extended by using several specifically designed protein models based on fold-recognition methods and by exhaustive searches performed in a parallelized pipeline. Updated results from the JCSG MR pipeline, which to date has solved 33 molecular-replacement structures with less than 35% sequence identity to the closest homologue of known structure, are presented

Crossref

PubMed Central

eScholarship - University of California

Domain analysis of the tubulin cofactor system: a model for tubulin folding and dimerization

Author: Godzik Adam
Grynberg Marcin
Jaroszewski Lukasz
Publication venue: BioMed Central
Publication date: 01/01/2003

BACKGROUND: The correct folding and dimerization of tubulins, before their addition to the microtubular structure, requires a group of conserved proteins called cofactors A to E. Biochemical analysis of the cofactors has given insight into their general functions; however, not much is known about the domain structure and detailed molecular function of these proteins. RESULTS: Combining modelling and fold prediction tools, we present 3D models of all cofactors, including several previously unannotated domains of cofactors B-E. Apart from the new HEAT and Armadillo domains in cofactor D and an unusual spectrin-like domain in cofactor C, we have identified a new subfamily of ubiquitin-like domains in cofactors B and E. Together, these observations provide a reliable, molecular-level model of the cofactor complex. CONCLUSION: Distant homology searches allowed the identification of unknown regions of cofactors as self-reliant domains and allow us to present a detailed hypothesis of how a cofactor complex performs its function.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

PDBFlex: exploring flexibility in protein structures.

Author: Godzik Adam
Hrabe Thomas
Jaroszewski Lukasz
Li Zhanwen
Rotkiewicz Piotr
Sedova Mayya
Publication venue: eScholarship, University of California
Publication date: 28/11/2015

The PDBFlex database, available freely and with no login requirements at http://pdbflex.org, provides information on the flexibility of protein structures as revealed by the analysis of variations between depositions of different structural models of the same protein in the Protein Data Bank (PDB). PDBFlex collects information on all instances of such depositions, identifying them by a 95% sequence identity threshold, analyzes their structural differences and clusters them according to their structural similarities for easy analysis. PDBFlex contains tools and viewers enabling in-depth examination of structural variability, including: 2D-scaling visualization of RMSD distances between structures of the same protein, graphs of average local RMSD in the aligned structures of protein chains, graphical presentation of differences in secondary structure and observed structural disorder (unresolved residues), difference distance maps between all sets of coordinates, and 3D views of individual structures and simulated transitions between different conformations, the latter displayed using JSMol visualization software.
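The RMSD distances referred to in this abstract can be illustrated with a minimal sketch of the standard root-mean-square deviation between two coordinate sets. It assumes the two structures are already superposed and that atoms are paired one-to-one by position in the list; the function name `rmsd` and the example coordinates are illustrative and are not taken from PDBFlex.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two invented two-atom "structures" differing by 2 units in z for the second atom.
model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(model_1, model_2))
```

In practice an optimal superposition step would precede this calculation; it is omitted here to keep the sketch short.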

PubMed Central

eScholarship - University of California

Integrated web service for improving alignment quality based on segments comparison

Author: Godzik Adam
Jaroszewski Lukasz
Plewczynski Dariusz
Rychlewski Leszek
Ye Yuzhen
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: Defining the blocks that form the global protein structure on the basis of local structural regularity is a very fruitful idea, extensively used in the description and prediction of structure from sequence information alone. For many years, secondary structure elements were used as the available building blocks with great success. Specially prepared sets of possible structural motifs can be used to describe similarity between very distant, non-homologous proteins. The reason for utilizing structural information in the description of proteins is straightforward: structural comparison is able to detect approximately twice as many distant relationships as sequence comparison at the same error rate. RESULTS: Here we provide a new fragment library for Local Structure Segment (LSS) prediction called FRAGlib, which is integrated with a previously described segment alignment algorithm, SEA. A joint FRAGlib/SEA server provides easy access to both algorithms, allowing a one-stop alignment service using a novel approach to protein sequence alignment based on network matching. FRAGlib used as a secondary structure predictor achieves only 73% accuracy in the Q3 measure, but when combined with the SEA alignment, it achieves a significant improvement in pairwise sequence alignment quality compared with the previous SEA implementation and other public alignment algorithms. The FRAGlib algorithm takes ~2 min to search the FRAGlib database for a typical query protein of 500 residues. The SEA service aligns two typical proteins within ~5 min. All supplementary materials (detailed results of all the benchmarks, the list of test proteins and the whole fragment library) are available for download on-line at . CONCLUSIONS: The joint FRAGlib/SEA server will be a valuable tool both for molecular biologists working on protein sequence analysis and for bioinformaticians developing computational methods of structure prediction and alignment of proteins.
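The Q3 measure quoted in this abstract is the fraction of residues whose predicted three-state secondary structure label (helix, strand or coil) matches the observed one. A minimal sketch follows; the function name `q3_accuracy`, the H/E/C letter encoding and the example strings are assumptions for illustration and do not come from the FRAGlib/SEA server.

```python
def q3_accuracy(predicted, observed):
    """Fraction of positions where the predicted 3-state label equals the observed one.

    Both arguments are equal-length strings over the assumed alphabet
    H (helix), E (strand) and C (coil).
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)


# Invented 8-residue example with one mismatch at the fourth position.
print(q3_accuracy("HHHHEECC", "HHHCEECC"))
```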

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

FFAS03: a server for profile–profile sequence alignments

Author: Godzik Adam
Jaroszewski Lukasz
Li Weizhong
Li Zhanwen
Rychlewski Leszek
Publication venue: Oxford University Press
Publication date: 27/06/2005

The FFAS03 server provides a web interface to the third generation of the profile–profile alignment and fold-recognition algorithm of fold and function assignment system (FFAS) [L. Rychlewski, L. Jaroszewski, W. Li and A. Godzik (2000), Protein Sci., 9, 232–241]. Profile–profile algorithms use information present in sequences of homologous proteins to amplify the patterns defining the family. As a result, they enable detection of remote homologies beyond the reach of other methods. FFAS, initially developed in 2000, is consistently one of the best ranked fold prediction methods in the CAFASP and LiveBench competitions. It is also used by several fold-recognition consensus methods and meta-servers. The FFAS03 server accepts a user supplied protein sequence and automatically generates a profile, which is then compared with several sets of sequence profiles of proteins from PDB, COG, PFAM and SCOP. The profile databases used by the server are automatically updated with the latest structural and sequence information. The server provides access to the alignment analysis, multiple alignment, and comparative modeling tools. Access to the server is open for both academic and commercial researchers. The FFAS03 server is available at

Crossref

PubMed Central

eScholarship - University of California

Two Pfam protein families characterized by a crystal structure of protein lpg2210 from Legionella pneumophila.

Author: Aravind L
Axelrod Herbert L
Bateman Alex
Chang Yuanyuan
Coggill Penelope
Das Debanu
Eberhardt Ruth Y
Finn Robert D
Godzik Adam
Jaroszewski Lukasz
Murzin Alexey G
Xu Qingping
Publication venue: eScholarship, University of California
Publication date: 01/09/2013

Background: Every genome contains a large number of uncharacterized proteins that may encode entirely novel biological systems. Many of these uncharacterized proteins fall into related sequence families. By applying sequence and structural analysis we hope to provide insight into novel biology. Results: We analyze a previously uncharacterized Pfam protein family called DUF4424 [Pfam:PF14415]. The recently solved three-dimensional structure of the protein lpg2210 from Legionella pneumophila provides the first structural information pertaining to this family. This protein additionally includes the first representative structure of another Pfam family called the YARHG domain [Pfam:PF13308]. The Pfam family DUF4424 adopts a 19-stranded beta-sandwich fold that shows similarity to the N-terminal domain of leukotriene A-4 hydrolase. The YARHG domain forms an all-helical domain at the C-terminus. Structure analysis allows us to recognize distant similarities between the DUF4424 domain and individual domains of M1 aminopeptidases and tricorn proteases, which form massive proteasome-like capsids in both archaea and bacteria. Conclusions: Based on our analyses we hypothesize that the DUF4424 domain may have a role in forming large, multi-component enzyme complexes. We suggest that the YARHG domain may play a role in binding a moiety in proximity with peptidoglycan, such as a hydrophobic outer membrane lipid or lipopolysaccharide.

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Structure of the first representative of Pfam family PF04016 (DUF364) reveals enolase and Rossmann-like folds that combine to form a unique active site with a possible role in heavy-metal chelation.

Author: Abdubek Polat
Aravind L
Astakhova Tamara
Axelrod Herbert L
Bakolitsa Constantina
Carlton Dennis
Chiu Hsiu Ju
Clayton Thomas
Deacon Ashley M
Deller Marc C
Duan Lian
Elsliger Marc André
Feuerhelm Julie
Godzik Adam
Grant Joanna C
Han Gye Won
Hodgson Keith O
Jaroszewski Lukasz
Jin Kevin K
Klock Heath E
Knuth Mark W
Kozbial Piotr
Krishna S Sri
Kumar Abhinav
Lesley Scott A
Marciano David
McMullan Daniel
Miller Mitchell D
Morse Andrew T
Nigoghossian Edward
Okach Linda
Reyes Ron
Rife Christopher L
van den Bedem Henry
Weekes Dana
Wilson Ian A
Wooley John
Xu Qingping
Publication venue: eScholarship, University of California
Publication date: 06/07/2010

The crystal structure of Dhaf4260 from Desulfitobacterium hafniense DCB-2 was determined by single-wavelength anomalous diffraction (SAD) to a resolution of 2.01 Å using the semi-automated high-throughput pipeline of the Joint Center for Structural Genomics (JCSG) as part of the NIGMS Protein Structure Initiative (PSI). This protein structure is the first representative of the PF04016 (DUF364) Pfam family and reveals a novel combination of two well known domains (an enolase N-terminal-like fold followed by a Rossmann-like domain). Structural and bioinformatic analyses reveal partial similarities to Rossmann-like methyltransferases, with residues from the enolase-like fold combining to form a unique active site that is likely to be involved in the condensation or hydrolysis of molecules implicated in the synthesis of flavins, pterins or other siderophores. The genome context of Dhaf4260 and homologs additionally supports a role in heavy-metal chelation

PubMed Central

eScholarship - University of California