Fast diffusion of a Lennard-Jones cluster on a crystalline surface
We present a Molecular Dynamics study of large Lennard-Jones clusters
evolving on a crystalline surface. The static and the dynamic properties of the
cluster are described. We find that large clusters can diffuse rapidly, as
experimentally observed. The role of the mismatch between the lattice
parameters of the cluster and the substrate is emphasized to explain the
diffusion of the cluster. This diffusion can be described as a Brownian motion
induced by the vibrational coupling to the substrate, a mechanism that has not
been previously considered for cluster diffusion.
tRNA methylation resolves codon usage bias at the limit of cell viability.
Codon usage of each genome is closely correlated with the abundance of tRNA isoacceptors. How codon usage bias is resolved by tRNA post-transcriptional modifications is largely unknown. Here we demonstrate that the N1-methylation of guanosine at position 37 (m1G37) on the 3'-side of the anticodon, while not directly responsible for reading of codons, is a neutralizer that resolves differential decoding of proline codons. A genome-wide suppressor screen of a non-viable Escherichia coli strain, lacking m1G37, identifies proS suppressor mutations, indicating a coupling of methylation with tRNA prolyl-aminoacylation that sets the limit of cell viability. Using these suppressors, where prolyl-aminoacylation is decoupled from tRNA methylation, we show that m1G37 neutralizes differential translation of proline codons by the major isoacceptor. Lack of m1G37 inactivates this neutralization and exposes the need for a minor isoacceptor for cell viability. This work has medical implications for bacterial species that exclusively use the major isoacceptor for survival.
Dynamic Proteomics of Individual Cancer Cells in Response to a Drug
Why do seemingly identical cells respond differently to a drug? To address this, we studied the dynamics and variability of the protein response of human cancer cells to a chemotherapy drug, camptothecin. We present a dynamic-proteomics approach that measures the levels and locations of nearly 1000 different endogenously tagged proteins in individual living cells at high temporal resolution. All cells show rapid translocation of proteins specific to the drug mechanism, including the drug target (topoisomerase-1), and slower, wide-ranging temporal waves of protein degradation and accumulation. However, the cells differ in the behavior of a subset of proteins. We identify proteins whose dynamics differ widely between cells, in a way that corresponds to the outcomes: cell death or survival. This opens the way to understanding molecular responses to drugs in individual cells.
COMPASS server for homology detection: improved statistical accuracy, speed and functionality
COMPASS is a profile-based method for the detection of remote sequence similarity and the prediction of protein structure. Here we describe a recently improved public web server of COMPASS, http://prodata.swmed.edu/compass. The server features three major developments: (i) improved statistical accuracy; (ii) increased speed from parallel implementation; and (iii) new functional features facilitating structure prediction. These features include visualization tools that allow the user to quickly and effectively analyze specific local structural region predictions suggested by COMPASS alignments. As an application example, we describe the structural, evolutionary and functional analysis of a protein with unknown function that served as a target in the recent CASP8 (Critical Assessment of Techniques for Protein Structure Prediction round 8).
Considering scores between unrelated proteins in the search database improves profile comparison
BACKGROUND: Profile-based comparison of multiple sequence alignments is a powerful methodology for the detection of remote protein sequence similarity, which is essential for the inference and analysis of protein structure, function, and evolution. Accurate estimation of statistical significance of detected profile similarities is essential for further development of this methodology. Here we analyze a novel approach to estimate the statistical significance of profile similarity: the explicit consideration of background score distributions for each database template (subject). RESULTS: Using a simple scheme to combine and analytically approximate query- and subject-based distributions, we show that (i) inclusion of background distributions for the subjects increases the quality of homology detection; (ii) this increase is higher when the distributions are based on the scores to all known non-homologs of the subject rather than a small calibration subset of the database representatives; and (iii) these distributions of scores to all known non-homologs of the subject make the dominant contribution to the improved performance: adding the calibration distribution of the query has a negligible additional effect. CONCLUSION: The construction of distributions based on the complete sets of non-homologs for each subject is particularly relevant in the setting of structure prediction where the database consists of proteins with solved 3D structure (PDB, SCOP, CATH, etc.) and therefore structural relationships between proteins are known. These results point to a potential new direction in the development of more powerful methods for remote homology detection.
Finding regulatory elements and regulatory motifs: a general probabilistic framework
Over the last two decades a large number of algorithms have been developed for regulatory motif finding. Here we show how many of these algorithms, especially those that model binding specificities of regulatory factors with position specific weight matrices (WMs), naturally arise within a general Bayesian probabilistic framework. We discuss how WMs are constructed from sets of regulatory sites, how sites for a given WM can be discovered by scanning of large sequences, how to cluster WMs, and more generally how to cluster large sets of sites from different WMs into clusters. We discuss how 'regulatory modules', clusters of sites for subsets of WMs, can be found in large intergenic sequences, and we discuss different methods for ab initio motif finding, including expectation maximization (EM) algorithms, and motif sampling algorithms. Finally, we extensively discuss how module finding methods and ab initio motif finding methods can be extended to take phylogenetic relations between the input sequences into account, i.e. we show how motif finding and phylogenetic footprinting can be integrated in a rigorous probabilistic framework. The article is intended for readers with a solid background in applied mathematics, and preferably with some knowledge of general Bayesian probabilistic methods. The main purpose of the article is to elucidate that all these methods are not a disconnected set of individual algorithmic recipes, but that they are just different facets of a single integrated probabilistic theory.
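To make the weight-matrix idea in the abstract above concrete, here is a minimal Python sketch of building a WM from aligned sites and scanning a sequence with a log-odds score. It is an illustration only, not the article's method: the function names, the pseudocount of 1, the uniform background, and the example sites are all assumptions introduced here.

```python
import math

BASES = "ACGT"

def build_weight_matrix(sites, pseudocount=1.0):
    """Estimate per-position base probabilities from aligned, equal-length sites.

    The pseudocount is an illustrative choice, not a value from the article.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for pos in range(length):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({base: counts[base] / total for base in BASES})
    return matrix

def log_odds(window, matrix, background=None):
    """Log-likelihood ratio of a window under the WM versus a background model."""
    if background is None:
        background = {base: 0.25 for base in BASES}  # assumed uniform background
    return sum(math.log(column[base] / background[base])
               for base, column in zip(window, matrix))

def scan(sequence, matrix, threshold=0.0):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = log_odds(sequence[start:start + width], matrix)
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical example sites, chosen only for illustration.
example_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
example_wm = build_weight_matrix(example_sites)
example_hits = scan("GGGGTATAATGGGG", example_wm, threshold=2.0)
```

With these example inputs, the only window above the threshold is the one that starts at index 4, where the consensus TATAAT sits. A fuller treatment in the spirit of the abstract would replace the fixed threshold with a posterior probability over site placements.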
Common Peptides Study of Aminoacyl-tRNA Synthetases
Aminoacyl tRNA synthetases (aaRSs) constitute an essential enzyme super-family, providing fidelity of the translation process of mRNA to proteins in living cells. They are common to all kingdoms and are of utmost importance to all organisms. It is thus of great interest to understand the evolutionary relationships among them and underline signature motifs defining their common domains. We utilized the Common Peptides (CPs) framework, based on extracted deterministic motifs from all aaRSs, to study family-specific properties. We identified novel aaRS class-related signatures that may supplement the current classification methods and provide a basis for identifying functional regions specific to each aaRS class. We exploited the space spanned by the CPs in order to identify similarities between aaRS families that are not observed using sequence alignment methods, identifying different inter-aaRS associations across different kingdoms of life. We explored the evolutionary history of the aaRS families and evolutionary origins of the mitochondrial aaRSs. Lastly, we showed that prevalent CPs significantly overlap known catalytic and binding sites, suggesting that they have meaningful functional roles, as well as identifying a motif shared between aaRSs and the Biotin-[acetyl-CoA carboxylase] synthetase (birA) enzyme that overlaps binding sites in both families. The study presents the multitude of ways to exploit the CP framework in order to extract meaningful patterns from the aaRS super-family. Specific CPs, discovered in this study, may play important roles in the functionality of these enzymes. We explored the evolutionary patterns in each aaRS family and tracked remote evolutionary links between these families.
MACSIMS: multiple alignment of complete sequences information management system
BACKGROUND: In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family. RESULTS: MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data is evaluated and propagated from known to unknown sequences with these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIM results, while a graphical display using the JalView program allows manual analysis. CONCLUSION: MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at
Island method for estimating the statistical significance of profile-profile alignment scores
BACKGROUND: In the last decade, a significant improvement in detecting remote similarity between protein sequences has been made by utilizing alignment profiles in place of amino-acid strings. Unfortunately, no analytical theory is available for estimating the significance of a gapped alignment of two profiles. Many experiments suggest that the distribution of local profile-profile alignment scores is of the Gumbel form. However, estimating distribution parameters by random simulations turns out to be computationally very expensive. RESULTS: We demonstrate that the background distribution of profile-profile alignment scores heavily depends on profiles' composition and thus the distribution parameters must be estimated independently, for each pair of profiles of interest. We also show that accurate estimates of statistical parameters can be obtained using the "island statistics" for profile-profile alignments. CONCLUSION: The island statistics can be generalized to profile-profile alignments to provide an efficient method for the alignment score normalization. Since multiple island scores can be extracted from a single comparison of two profiles, the island method has a clear speed advantage over the direct shuffling method for comparable accuracy in parameter estimates.
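As a rough illustration of what "of the Gumbel form" means for score significance in the abstract above, the Python sketch below converts an alignment score into an E-value and a P-value using the standard extreme-value expression P(S >= x) = 1 - exp(-K m n e^{-lambda x}). The parameter names `lam`, `K`, `m`, and `n` are assumptions made here, and the numeric values are placeholders rather than estimates from the paper. The sketch takes lambda and K as already estimated for the profile pair and does not implement the island method itself.

```python
import math

def gumbel_evalue(score, lam, K, m, n):
    """Expected number of chance alignments scoring at least `score`.

    lam and K are the Gumbel scale and prefactor parameters; m and n are
    the lengths of the two profiles (the search-space dimensions).
    """
    return K * m * n * math.exp(-lam * score)

def gumbel_pvalue(score, lam, K, m, n):
    """P(S >= score) = 1 - exp(-E), computed stably for small E via expm1."""
    return -math.expm1(-gumbel_evalue(score, lam, K, m, n))

# Placeholder parameters for illustration only; real values would be
# estimated separately for each pair of profiles.
example_params = dict(lam=0.3, K=0.1, m=100, n=100)
example_p_low = gumbel_pvalue(0.0, **example_params)
example_p_high = gumbel_pvalue(100.0, **example_params)
```

For high scores the P-value and the E-value nearly coincide, which is why either is commonly reported. The accuracy of both depends entirely on how well lambda and K are estimated, and that estimation is the step the abstract is concerned with.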
Identifying Biological Network Structure, Predicting Network Behavior, and Classifying Network State With High Dimensional Model Representation (HDMR)
This work presents an adapted Random Sampling - High Dimensional Model Representation (RS-HDMR) algorithm for synergistically addressing three key problems in network biology: (1) identifying the structure of biological networks from multivariate data, (2) predicting network response under previously unsampled conditions, and (3) inferring experimental perturbations based on the observed network state. RS-HDMR is a multivariate regression method that decomposes network interactions into a hierarchy of non-linear component functions. Sensitivity analysis based on these functions provides a clear physical and statistical interpretation of the underlying network structure. The advantages of RS-HDMR include efficient extraction of nonlinear and cooperative network relationships without resorting to discretization, prediction of network behavior without mechanistic modeling, robustness to data noise, and favorable scalability of the sampling requirement with respect to network size. As a proof-of-principle study, RS-HDMR was applied to experimental data measuring the single-cell response of a protein-protein signaling network to various experimental perturbations. A comparison to network structure identified in the literature and through other inference methods, including Bayesian and mutual-information based algorithms, suggests that RS-HDMR can successfully reveal a network structure with a low false positive rate while still capturing non-linear and cooperative interactions. RS-HDMR identified several higher-order network interactions that correspond to known feedback regulations among multiple network species and that were unidentified by other network inference methods. Furthermore, RS-HDMR has a better ability to predict network response under unsampled conditions in this application than the best statistical inference algorithm presented in the recent DREAM3 signaling-prediction competition. 
RS-HDMR can discern and predict differences in network state that arise from sources ranging from intrinsic cell-cell variability to altered experimental conditions, such as when drug perturbations are introduced. This ability ultimately allows RS-HDMR to accurately classify the experimental conditions of a given sample based on its observed network state.