Pairwise alignment between query sequence <i>O67940_AQUAE</i> and 2Q6O (top) and 1RQP (bottom).
<p>(Top) Query aligns end-to-end without any long gaps with a sequence identity of 32%. (Bottom) Query aligns end-to-end but with three regions of gaps, the most significant being a 23-residue region in 1RQP residues 92–116. The sequence identity of query with 1RQP is 26%.</p>
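<p>The sequence identities quoted above can be illustrated with a minimal sketch. It assumes a pairwise alignment is available as two equal-length strings with "-" marking gaps, and it counts identity only over columns where neither sequence has a gap. That denominator is an assumption: other tools divide by the full alignment length, so their figures can differ. The function name and the toy sequences are hypothetical and are not taken from the 2Q6O or 1RQP alignments.</p>

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: gaps are written as '-' and both strings have the same length.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for q, s in zip(aligned_query, aligned_subject):
        if q == "-" or s == "-":
            continue  # skip gapped columns
        compared += 1
        if q.upper() == s.upper():
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences, for illustration only).
toy_query = "MKT-LLVAG"
toy_subject = "MRTALL-AG"
toy_identity = percent_identity(toy_query, toy_subject)
print(f"{toy_identity:.1f}% identity")
```

<p>In the toy alignment, seven columns are ungapped and six of them match, so the sketch reports roughly 86% identity.</p>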
SCOP output.
<p>1RQP is used because our query protein O67940 from <i>Aquifex aeolicus</i> does not have a solved structure. The results indicate that the N-terminal and C-terminal domains of 1RQP belong to two SCOP superfamilies. (The SCOP database provides a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known.)</p>
Structure-guided alignment constructed with homologous sequences using Cn3D (top) and neighbor-joining tree based on the score of aligned residues from homologous sequences using CDTree (bottom).
Ligplot for 1RQP.
<p>SAM-binding residues. Dashed green lines indicate hydrogen bonds, and half-moons indicate van der Waals interactions. (Ligplot is a program for automatically plotting protein–ligand interactions, provided as part of PDBsum, a Web-based database of summaries and analyses of all PDB structures.)</p>
Percent-identity scale.
<p>The horizontal line gives the percent identity between query and subject sequences, and the boxes give the resources and tools that can be used for functional inference.</p>
Ten-step procedure for comparative analysis of protein structures and sequences to infer biological function.
PSI-BLAST input panel (top) and PSI-BLAST output iteration (bottom).
<p>(Top) Default parameters are used. The FASTA sequence of the query protein with UniProt accession O67940 from <i>Aquifex aeolicus</i> is searched against NCBI's nr database. (Bottom) The query protein <i>O67940_AQUAE</i> hits several structures (tagged with S in a red box). Only two of the non-redundant structures, PDB IDs 2Q6O and 1RQP (marked by a pink box), are functionally characterized; their e-values are 3e-20 and 3e-17, and their percent identities are 32% and 26%, respectively. (The Expect value (E), or e-value, is a parameter that describes the number of hits one can “expect” to see by chance when searching a database of a particular size. It decreases exponentially as the Score (S) of the match increases.)</p>
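<p>The exponential relationship between E and S can be sketched with the Karlin-Altschul form E = K·m·n·e^(−λS), where m and n are the query and database lengths. This is a rough illustration, not a reproduction of the PSI-BLAST run above: the function name is hypothetical, and the values of K, λ, m, and n below are placeholder assumptions rather than the parameters used for O67940.</p>

```python
import math


def expect_value(raw_score: float, query_len: int, db_len: int,
                 k: float = 0.041, lam: float = 0.267) -> float:
    """Karlin-Altschul estimate E = K * m * n * exp(-lambda * S).

    k and lam are illustrative placeholders (assumptions); real values
    depend on the scoring matrix and gap penalties in use.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical sizes, chosen only to show the trend.
QUERY_LEN = 250
DB_LEN = 1_000_000_000

for score in (50, 100, 150, 200):
    e = expect_value(score, QUERY_LEN, DB_LEN)
    print(f"S = {score:3d}  ->  E = {e:.3e}")
```

<p>Each fixed increase in S multiplies E by the same factor, e^(−λ·ΔS), which is why a modest gain in score produces a much smaller e-value.</p>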
PIRSF (A,B), COG (C,D), and Pfam (E,F) input and results.
<p>(A) The FASTA sequence of the query protein with UniProt accession O67940 from <i>Aquifex aeolicus</i> is scanned against PIR's curated family database. (The query is searched against the full-length and domain hidden Markov models for manually curated PIRSFs. If a match is found, the matched regions and statistics are displayed.) (B) The query hits the PIRSF family PIRSF006779. The output provides family details, statistical data for full-length proteins and composite domains, and a pairwise alignment of the query with the consensus sequence of the PIRSF. (C) The FASTA sequence of the query protein is scanned against the database of clusters of orthologous groups (COG). COG compares protein sequences encoded in complete genomes, representing major phylogenetic lineages. Each COG consists of orthologous/co-orthologous proteins from at least three lineages. (D) The query hits COG1912. The output provides the family details: statistical score, reciprocal best hits, and members of the family. (E) The FASTA sequence of the query protein is scanned against the Pfam domain database. The Pfam database is a large collection of domain families, each represented by multiple sequence alignments and hidden Markov models (HMMs). (F) The query hits Pfam family PF01887.</p>
IPA results for functional analysis of the protein data set.
<p>The threshold for all analyses was set to p < 0.05, and the data are plotted against the –log of the p values. The top three entries with the lowest p values are presented for each of three IPA categories. <b>A</b>. Molecular and cellular functions: CTCSI, Cell-to-Cell Signaling and Interaction; CM, Cellular Movement; CFM, Cellular Function and Maintenance. <b>B</b>. Physiological system development and functions: TD, Tissue Development; OD, Organismal Development; HSDF, Hematological System Development and Function. <b>C</b>. Canonical pathways: CS, Coagulation System; IPAP, Intrinsic Prothrombin Activation Pathway; CES, Caveolar-mediated Endocytosis Signaling.</p>
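<p>The thresholding and –log transform described in this caption can be sketched as follows. The caption does not state the base of the logarithm; base 10 is assumed here. The category names and p values are made-up placeholders, not the IPA results.</p>

```python
import math

P_THRESHOLD = 0.05  # significance cut-off stated in the caption


def neg_log10(p: float) -> float:
    """Return -log10(p); base 10 is an assumption, the caption only says '-log'."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p value must be in the interval (0, 1]")
    return -math.log10(p)


# Placeholder p values (hypothetical, for illustration only).
toy_p_values = {
    "category_a": 0.001,
    "category_b": 0.03,
    "category_c": 0.2,
    "category_d": 0.0004,
}

# Keep entries below the threshold, then rank by -log10(p), largest first.
significant = {
    name: neg_log10(p) for name, p in toy_p_values.items() if p < P_THRESHOLD
}
top_three = sorted(significant.items(), key=lambda item: item[1], reverse=True)[:3]

for name, score in top_three:
    print(f"{name}: -log10(p) = {score:.2f}")
```

<p>Because the transform is monotonically decreasing in p, ranking by the largest –log value is equivalent to ranking by the smallest p value.</p>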