50 research outputs found
Gravity model in the Korean highway
We investigate the traffic flows of the Korean highway system, which contains
both public and private transportation information. We find that the traffic
flow T(ij) between city i and j forms a gravity model, the metaphor of physical
gravity as described in Newton's law of gravity, P(i)P(j)/r(ij)^2, where P(i)
represents the population of city i and r(ij) the distance between cities i and
j. It is also shown that the highway network has a heavy tail even though the
road network is a rather uniform and homogeneous one. Compared to the highway
network, air and public ground transportation establish inhomogeneous systems
and have power-law behaviors. Comment: 13 pages
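As an illustration of the gravity form quoted in the abstract, P(i)P(j)/r(ij)^2, the following minimal sketch computes a predicted flow for one city pair. The function name, the proportionality constant k, the adjustable exponent, and the population and distance values are assumptions made for the example; they are not taken from the paper.

```python
def gravity_flow(pop_i, pop_j, distance, exponent=2.0, k=1.0):
    """Predicted flow between two cities: k * P_i * P_j / r_ij**exponent.

    k and exponent are illustrative parameters; the abstract only states
    the proportionality P(i)P(j)/r(ij)^2.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_i * pop_j / distance ** exponent


# Hypothetical populations and distances (km), for illustration only.
flow_near = gravity_flow(1_000_000, 500_000, 100.0)
flow_far = gravity_flow(1_000_000, 500_000, 200.0)

# With exponent 2, doubling the distance divides the predicted flow by 4.
ratio = flow_near / flow_far
print(ratio)
```

In practice the exponent and k would be fitted to observed traffic counts rather than fixed in advance.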
Statistical Analysis of the Road Network of India
In this paper we study the Indian Highway Network as a complex network where
the junction points are considered as nodes, and the links are formed by an
existing connection. We explore the topological properties and community
structure of the network. We observe that the Indian Highway Network displays
small world properties and is assortative in nature. We also identify the most
important road-junctions (or cities) in the highway network based on the
betweenness centrality of the node. This could help in identifying the
potential congestion points in the network. Our study is of practical
importance and could provide a novel approach to reduce congestion and improve
the performance of the highway network.
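The ranking of junctions by betweenness centrality mentioned in the abstract can be sketched as follows. This is a plain-Python version of Brandes' algorithm for an unweighted, undirected graph; the toy junction names and links are hypothetical, and the paper may have used weighted links or different software.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes' algorithm).

    adj maps each node to a list of its neighbours in an unweighted,
    undirected graph.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical toy network: junction C links A, B and D; D links to E.
toy = {
    "A": ["C"],
    "B": ["C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
centrality = betweenness(toy)
ranked = sorted(centrality, key=centrality.get, reverse=True)
print(ranked[0], centrality[ranked[0]])
```

In this toy network junction C lies on the most shortest paths, which is the sense in which a high-betweenness junction is a candidate congestion point.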
Static and dynamic characteristics of protein contact networks
The principles underlying protein folding remain one of Nature's puzzles
with important practical consequences for Life. An approach that has gathered
momentum since the late 1990s looks at protein hetero-polymers and their
folding process through the lens of complex network analysis. Consequently,
there is now a body of empirical studies describing topological characteristics
of protein macro-molecules through their contact networks and linking these
topological characteristics to protein folding. The present paper is primarily
a review of this rich area. But it delves deeper into certain aspects by
emphasizing short-range and long-range links, and suggests unconventional
places where "power-laws" may be lurking within protein contact networks.
Further, it considers the dynamical view of protein contact networks. This
closer scrutiny of protein contact networks raises new questions for further
research, and identifies new regularities which may be useful to parameterize a
network approach to protein folding. Preliminary experiments with such a model
confirm that the regularities we identified cannot be easily reproduced through
random effects. Indeed, the grand challenge of protein folding is to elucidate
the process(es) which not only generates the specific and diverse linkage
patterns of protein contact networks, but also reproduces the dynamic behavior
of proteins as they fold. Keywords: network analysis, protein contact networks,
protein folding. Comment: Added Appendix
An approach for the identification of targets specific to bone metastasis using cancer genes interactome and gene ontology analysis
Metastasis is one of the most enigmatic aspects of cancer pathogenesis and is
a major cause of cancer-associated mortality. Secondary bone cancer (SBC) is a
complex disease caused by metastasis of tumor cells from their primary site and
is characterized by intricate interplay of molecular interactions.
Identification of targets for multifactorial diseases such as SBC, the most
frequent complication of breast and prostate cancers, is a challenge. Towards
achieving our aim of identification of targets specific to SBC, we constructed
a 'Cancer Genes Network', a representative protein interactome of cancer genes.
Using graph theoretical methods, we obtained a set of key genes that are
relevant for generic mechanisms of cancers and have a role in biological
essentiality. We also compiled a curated dataset of 391 SBC genes from
published literature which serves as a basis of ontological correlates of
secondary bone cancer. Building on these results, we implement a strategy based
on generic cancer genes, SBC genes and gene ontology enrichment method, to
obtain a set of targets that are specific to bone metastasis. Through this
study, we present an approach for probing one of the major complications in
cancers, namely, metastasis. The results on genes that play generic roles in
cancer phenotype, obtained by network analysis of 'Cancer Genes Network', have
broader implications in understanding the role of molecular regulators in
mechanisms of cancers. Specifically, our study provides a set of potential
targets that are of ontological and regulatory relevance to secondary bone
cancer. Comment: 54 pages (19 pages main text; 11 Figures; 26 pages of supplementary
information). Revised after critical reviews. Accepted for Publication in
PLoS ONE
Statistical characterization of deviations from planned flight trajectories in air traffic management
Understanding the relation between planned and realized flight trajectories and the determinants of flight deviations is of great importance in air traffic management. In this paper we perform an in-depth investigation of the statistical properties of planned and realized air traffic in German airspace during a 28-day period, corresponding to an AIRAC cycle. We find that realized trajectories are on average shorter than planned ones and that this effect is stronger at night than during the day. Flights are more frequently deviated close to the departure airport and at a relatively large angle-to-destination. Moreover, the probability of a deviation is higher in low-traffic phases. All this evidence indicates that deviations are mostly used by controllers to give directs to flights when traffic conditions allow it. Finally, we introduce a new metric, termed di-fork, which characterizes navigation points according to the likelihood that a deviation occurs there. Di-fork makes it possible to identify, in a statistically rigorous way, navigation point pairs where deviations are more (less) frequent than expected under a null hypothesis of randomness that takes into account the heterogeneity of the navigation points. Such pairs can therefore be seen as sources of flexibility (stability) in controllers' traffic management while conjugating safety and efficiency.
Computational Prediction of Heme-Binding Residues by Exploiting Residue Interaction Network
Computational identification of heme-binding residues is beneficial for predicting and designing novel heme proteins. Here we proposed a novel method for heme-binding residue prediction by exploiting topological properties of these residues in the residue interaction networks derived from three-dimensional structures. Comprehensive analysis showed that key residues located in heme-binding regions are generally associated with the nodes with higher degree, closeness and betweenness, but lower clustering coefficient in the network. HemeNet, a support vector machine (SVM) based predictor, was developed to identify heme-binding residues by combining topological features with existing sequence and structural features. The results showed that incorporation of network-based features significantly improved the prediction performance. We also compared the residue interaction networks of heme proteins before and after heme binding and found that the topological features can well characterize the heme-binding sites of apo structures as well as those of holo structures, which led to reliable performance improvement as we applied HemeNet to predicting the binding residues of proteins in the heme-free state. HemeNet web server is freely accessible at http://mleg.cse.sc.edu/hemeNet/
Defining an Essence of Structure Determining Residue Contacts in Proteins
The network of native non-covalent residue contacts determines the three-dimensional structure of a protein. However, not all contacts are of equal structural significance, and little knowledge exists about a minimal, yet sufficient, subset required to define the global features of a protein. Characterisation of this “structural essence” has remained elusive so far: no algorithmic strategy has been devised to date that could outperform a random selection in terms of 3D reconstruction accuracy (measured as the Cα RMSD). It is not only of theoretical interest (i.e., for design of advanced statistical potentials) to identify the number and nature of essential native contacts; such a subset of spatial constraints is also very useful in a number of novel experimental methods (like EPR) which rely heavily on constraint-based protein modelling. To derive accurate three-dimensional models from distance constraints, we implemented a reconstruction pipeline using distance geometry. We selected a test-set of 12 protein structures from the four major SCOP fold classes and performed our reconstruction analysis. As a reference set, series of random subsets (ranging from 10% to 90% of native contacts) are generated for each protein, and the reconstruction accuracy is computed for each subset. We have developed a rational strategy, termed “cone-peeling”, that combines sequence features and network descriptors to select minimal subsets that outperform the reference sets. We present, for the first time, a rational strategy to derive a structural essence of residue contacts and provide an estimate of the size of this minimal subset. Our algorithm computes sparse subsets capable of determining the tertiary structure at approximately 4.8 Å Cα RMSD with as few as 8% of the native contacts (Cα-Cα and Cβ-Cβ). At the same time, a randomly chosen subset of native contacts needs about twice as many contacts to reach the same level of accuracy.
This “structural essence” opens new avenues in the fields of structure prediction, empirical potentials and docking.
Novel Feature for Catalytic Protein Residues Reflecting Interactions with Other Residues
Owing to their potential for systematic analysis, complex networks have been
widely used in proteomics. Representing a protein structure as a topology
network provides novel insight into understanding protein folding mechanisms,
stability and function. Here, we develop a new feature to reveal
correlations between residues using a protein structure network. In an original
attempt to quantify the effects of several key residues on catalytic residues, a
power function was used to model interactions between residues. The results
indicate that focusing on a few residues is a feasible approach to identifying
catalytic residues. The spatial environment surrounding a catalytic residue was
analyzed in a layered manner. We present evidence that correlation between
residues is related to their distance apart: (i) most environmental parameters of the
outer layer make a smaller contribution to prediction and (ii) catalytic residues
tend to be located near key positions in enzyme folds. Feature analysis revealed
satisfactory performance for our features, which were combined with several
conventional features in a prediction model for catalytic residues using a
comprehensive data set from the Catalytic Site Atlas. Values of 88.6 for
sensitivity and 88.4 for specificity were obtained by 10-fold cross-validation.
These results suggest that these features reveal the mutual dependence of
residues and are promising for further study of the structure-function
relationship.
Predicting disease-associated substitution of a single amino acid by analyzing residue interactions
Background: The rapid accumulation of data on non-synonymous single nucleotide polymorphisms (nsSNPs, also called SAPs) should allow us to further our understanding of the underlying disease-associated mechanisms. Here, we use complex networks to study the role of an amino acid in both local and global structures and determine the extent to which disease-associated and polymorphic SAPs differ in terms of their interactions with other residues.
Results: We found that SAPs can be well characterized by network topological features. Mutations are probably disease-associated when they occur at a site with a high centrality value and/or high degree value in a protein structure network. We also discovered that study of the neighboring residues around a mutation site can help to determine whether the mutation is disease-related or not. We compiled a dataset from the Swiss-Prot variant pages and constructed a model to predict disease-associated SAPs based on the random forest algorithm. The total accuracy and MCC were 83.0% and 0.64, respectively, as determined by 5-fold cross-validation. With an independent dataset, our model achieved a total accuracy of 80.8% and an MCC of 0.59.
Conclusions: The satisfactory performance suggests that network topological features can be used as quantification measures to determine the importance of a site on a protein, and this approach can complement existing methods for prediction of disease-associated SAPs. Moreover, the use of this method in SAP studies would help to determine the underlying linkage between SAPs and diseases through extensive investigation of mutual interactions between residues.
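The two performance figures reported in this abstract, total accuracy and the Matthews correlation coefficient (MCC), are both computed from a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not the paper's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when a row or column of the matrix is empty, a common
    convention for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 50 disease-associated and 50 polymorphic variants.
tp, tn, fp, fn = 40, 45, 5, 10
acc_value = accuracy(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
print(f"accuracy = {acc_value:.3f}, MCC = {mcc_value:.3f}")
```

MCC is often reported alongside accuracy because it stays informative when the two classes are of unequal size.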
Topological Structure of the Space of Phenotypes: The Case of RNA Neutral Networks
The evolution and adaptation of molecular populations is constrained by the diversity accessible through mutational processes. RNA is a paradigmatic example of a biopolymer where genotype (sequence) and phenotype (approximated by the secondary structure fold) are identified in a single molecule. The extreme redundancy of the genotype-phenotype map leads to large ensembles of RNA sequences that fold into the same secondary structure and can be connected through single-point mutations. These ensembles define neutral networks of phenotypes in sequence space. Here we analyze the topological properties of neutral networks formed by 12-nucleotide RNA sequences, obtained through the exhaustive folding of sequence space. A total of 4^12 sequences fragments into 645 subnetworks that correspond to 57 different secondary structures. The topological analysis reveals that each subnetwork is far from being random: it has a degree distribution with a well-defined average and a small dispersion, a high clustering coefficient, and an average shortest path between nodes close to its minimum possible value, i.e. the Hamming distance between sequences. RNA neutral networks are assortative due to the correlation in the composition of neighboring sequences, a feature that together with the symmetries inherent to the folding process explains the existence of communities. Several topological relationships can be analytically derived attending to structural restrictions and generic properties of the folding process. The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations. This property prevents fragmentation of neutral networks and thus enhances the navigability of sequence space. In summary, RNA neutral networks show unique topological properties, unknown to other networks previously described.
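To make the sequence-space picture concrete, the sketch below enumerates the single-point mutants of one 12-nucleotide sequence and checks their Hamming distance to it. The example sequence is arbitrary, and no folding is performed: deciding which neighbours share a secondary structure, and therefore belong to the same neutral network, would require an RNA folding program that is outside this example.

```python
ALPHABET = "ACGU"
LENGTH = 12


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def point_mutants(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, base in enumerate(seq):
        for alt in ALPHABET:
            if alt != base:
                yield seq[:i] + alt + seq[i + 1:]


# Size of the full sequence space for 12 nucleotides over 4 bases.
space_size = len(ALPHABET) ** LENGTH

# Arbitrary example sequence (an assumption, not taken from the paper).
seq = "GGGAAAUCCCAU"
neighbours = list(point_mutants(seq))

print(space_size, len(neighbours))
```

Each sequence has 3 x 12 = 36 single-point neighbours, so 36 is the upper bound on the degree of any node in a neutral network of this length.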