Distance, dissimilarity index, and network community structure
We address the question of finding the community structure of a complex
network. In an earlier effort [H. Zhou, Phys. Rev. E (2003)], the concept
of network random walking was introduced and a distance measure defined. Here we
calculate, based on this distance measure, the dissimilarity index between
nearest-neighboring vertices of a network and design an algorithm to partition
these vertices into communities that are hierarchically organized. Each
community is characterized by an upper and a lower dissimilarity threshold. The
algorithm is applied to several artificial and real-world networks, and
excellent results are obtained. In the case of artificially generated random
modular networks, this method outperforms the algorithm based on the concept of
edge betweenness centrality. For yeast's protein-protein interaction network,
we are able to identify many clusters that have well defined biological
functions.
Comment: 10 pages, 7 figures, REVTeX4 format
Network Landscape from a Brownian Particle's Perspective
Given a complex biological or social network, how many clusters should it be
decomposed into? We define the distance $d_{i,j}$ from node $i$ to node $j$ as
the average number of steps a Brownian particle takes to reach $j$ from $i$.
Node $j$ is a global attractor of $i$ if $d_{i,j} \leq d_{i,k}$ for any $k$ of
the graph; it is a local attractor of $i$ if $j \in E_i$ (the set of
nearest-neighbors of $i$) and $d_{i,j} \leq d_{i,l}$ for any $l \in E_i$. Based
on the intuition that each node should have a high probability to be in the
same community as its global (local) attractor on the global (local) scale, we
present a simple method to uncover a network's community structure. This method
is applied to several real networks and some discussion on its possible
extensions is made.Comment: 5 pages, 4 color-figures. REVTeX 4 format. To appear in PR
Identification of Amino Acid Sequences with Good Folding Properties in an Off-Lattice Model
Folding properties of a two-dimensional toy protein model containing only two
amino-acid types, hydrophobic and hydrophilic, respectively, are analyzed. An
efficient Monte Carlo procedure is employed to ensure that the ground states
are found. The thermodynamic properties are found to be strongly sequence
dependent in contrast to the kinetic ones. Hence, criteria for good folders are
defined entirely in terms of thermodynamic fluctuations. With these criteria,
sequence patterns that fold well are isolated. Of 300 chains with 20 randomly
chosen binary residues, approximately 10% meet these criteria. Also, an analysis
is performed by means of statistical and artificial neural network methods from
which it is concluded that the folding properties can be predicted to a certain
degree given the binary numbers characterizing the sequences.
Comment: 15 pages, 8 Postscript figures. Minor change
The 20 years of PROSITE
PROSITE consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. It is complemented by ProRule, a collection of rules based on profiles and patterns, which increases the discriminatory power of profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. In this article, we describe the implementation of a new method to assign a status to pattern matches, the new PROSITE web page and a new approach to improve the specificity and sensitivity of PROSITE methods. The latest version of PROSITE (release 20.19 of 11 September 2007) contains 1319 patterns, 745 profiles and 764 ProRules. Over the past 2 years, about 200 domains have been added, and now 53% of UniProtKB/Swiss-Prot entries (release 54.2 of 11 September 2007) have a PROSITE match. PROSITE is available on the web at: http://www.expasy.org/prosit
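To make the notion of a PROSITE pattern concrete, the sketch below translates the core of the pattern syntax into a Python regular expression and scans a sequence for matches. It is an illustration only, not the code behind PROSITE or ScanProsite. The function names are hypothetical, and only the common elements are handled: single residues, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, repetition counts such as `x(2,4)`, and the `<` and `>` terminal anchors at the ends of a pattern. Anchors placed inside brackets are rejected.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repetition count such as (3) or (2,4).
        m = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None or "<" in m.group(1) or ">" in m.group(1):
            raise ValueError(f"unsupported element: {element!r}")
        core, lo, hi = m.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # any residue except these
        elif core.startswith("[") and core.endswith("]"):
            pass                              # same syntax as a regex class
        elif not (len(core) == 1 and core.isalpha()):
            raise ValueError(f"unsupported element: {element!r}")
        count = ""
        if lo is not None:
            count = "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(prefix + core + count + suffix)
    return "".join(parts)


def find_matches(pattern, sequence):
    """Return (0-based start, matched text) pairs, including overlapping hits."""
    regex = prosite_to_regex(pattern)
    # A lookahead with a capture group reports matches that overlap.
    return [(m.start(), m.group(1))
            for m in re.finditer("(?=(" + regex + "))", sequence)]
```

As a usage example, the N-glycosylation site pattern `N-{P}-[ST]-{P}` becomes the regular expression `N[^P][ST][^P]`, and scanning the made-up sequence `ANGSAANPTA` with it reports a single hit, `NGSA`, at offset 1. The second `N` is not reported because it is followed by a proline.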
Protein folding using contact maps
We present the development of the idea to use dynamics in the space of
contact maps as a computational approach to the protein folding problem. We
first introduce two important technical ingredients, the reconstruction of a
three dimensional conformation from a contact map and the Monte Carlo dynamics
in contact map space. We then discuss two approximations to the free energy of
the contact maps and a method to derive energy parameters based on perceptron
learning. Finally we present results, first for predictions based on threading
and then for energy minimization of crambin and of a set of 6 immunoglobulins.
Our main result is a proof that the two simple approximations we studied
for the free energy are not suitable for protein folding. Perspectives are
discussed in the last section.
Comment: 29 pages, 10 figures
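For readers unfamiliar with the representation, a contact map is a symmetric binary matrix that records which pairs of residues lie close together in space. The sketch below shows only the forward direction, from coordinates to a map; the reconstruction of a conformation from a map, which the abstract names as a key ingredient, is a harder problem and is not attempted here. The function name, the 8.5 distance cutoff, and the `min_separation` parameter are assumptions chosen for illustration, not values taken from the paper.

```python
import math


def contact_map(coords, cutoff=8.5, min_separation=1):
    """Build a binary contact map from one 3D point per residue.

    Residues i and j are in contact when they are at least
    `min_separation` positions apart along the chain and their
    Euclidean distance is below `cutoff` (same units as `coords`).
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                cmap[i][j] = cmap[j][i] = 1   # the map is symmetric
    return cmap
```

With four points on a line at x = 0, 3.8, 7.6 and 20, the first three are mutually in contact under the default cutoff and the fourth is isolated. Raising `min_separation` to 2 drops the contacts between chain neighbors, which are always close and therefore carry little structural information.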
ProRule: a new database containing functional and structural information on PROSITE profiles
Motivation: Increase the discriminatory power of PROSITE profiles to facilitate function determination and provide biologically relevant information about domains detected by profiles for the annotation of proteins.
Summary: We have created a new database, ProRule, which contains additional information about PROSITE profiles. ProRule contains notably the position of structurally and/or functionally critical amino acids, as well as the condition they must fulfill to play their biological role. These supplementary data should help function determination and annotation of the UniProt Swiss-Prot knowledgebase. ProRule also contains information about the domain detected by the profile in the Swiss-Prot line format. Hence, ProRule can be used to make Swiss-Prot annotation more homogeneous and consistent. The format of ProRule can be extended to provide information about combination of domains.
Availability: ProRule can be accessed through ScanProsite at http://www.expasy.org/tools/scanprosite. A file containing the rules will be made available under the PROSITE copyright conditions on our ftp site (ftp://www.expasy.org/databases/prosite/) by the next PROSITE release.
Contact: [email protected]
ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins
ScanProsite—http://www.expasy.org/tools/scanprosite/—is a new and improved version of the web-based tool for detecting PROSITE signature matches in protein sequences. For a number of PROSITE profiles, the tool now makes use of ProRules—context-dependent annotation templates—to detect functional and structural intra-domain residues. The detection of those features enhances the power of function prediction based on profiles. Both user-defined sequences and sequences from the UniProt Knowledgebase can be matched against custom patterns, or against PROSITE signatures. To improve response times, matches of sequences from UniProtKB against PROSITE signatures are now retrieved from a pre-computed match database. Several output modes are available including simple text views and a rich mode providing an interactive match and feature viewer with a graphical representation of result
Recent improvements to the PROSITE database
The PROSITE database consists of a large collection of biologically meaningful signatures that are described as patterns or profiles. Each signature is linked to documentation that provides useful biological information on the protein family, domain or functional site identified by the signature. The PROSITE web page has been redesigned and several tools have been implemented to help the user discover new conserved regions in their own proteins and to visualize domain arrangements. We also introduced the facility to search PDB with a PROSITE entry or a user's pattern and visualize matched positions on 3D structures. The latest version of PROSITE (release 18.17 of November 30, 2003) contains 1676 entries. The database is accessible at http://www.expasy.org/prosit
Composite structural motifs of binding sites for delineating biological functions of proteins
Most biological processes are described as a series of interactions between
proteins and other molecules, and interactions are in turn described in terms
of atomic structures. To annotate protein functions as sets of interaction
states at atomic resolution, and thereby to better understand the relation
between protein interactions and biological functions, we conducted exhaustive
all-against-all atomic structure comparisons of all known binding sites for
ligands including small molecules, proteins and nucleic acids, and identified
recurring elementary motifs. By integrating the elementary motifs associated
with each subunit, we defined composite motifs which represent
context-dependent combinations of elementary motifs. It is demonstrated that
function similarity can be better inferred from composite motif similarity
than from the similarity of protein sequences or of individual binding sites.
By integrating the composite motifs associated with each protein function, we
define meta-composite motifs each of which is regarded as a time-independent
diagrammatic representation of a biological process. It is shown that
meta-composite motifs provide richer annotations of biological processes than
sequence clusters. The present results serve as a basis for bridging atomic
structures to higher-order biological phenomena by classification and
integration of binding site structures.
Comment: 34 pages, 7 figures
