    Path Similarity Analysis: a Method for Quantifying Macromolecular Pathways

    Diverse classes of proteins function through large-scale conformational changes; sophisticated enhanced sampling methods have been proposed to generate these macromolecular transition paths. As such paths are curves in a high-dimensional space, they have been difficult to compare quantitatively, which is a prerequisite for, for instance, assessing the quality of different sampling algorithms. The Path Similarity Analysis (PSA) approach alleviates these difficulties by utilizing the full information in 3N-dimensional trajectories in configuration space. PSA employs the Hausdorff or Fréchet path metrics (adopted from computational geometry), enabling us to quantify path (dis)similarity, while the new concept of a Hausdorff-pair map permits the extraction of atomic-scale determinants responsible for path differences. Combined with clustering techniques, PSA facilitates the comparison of many paths, including collections of transition ensembles. We use the closed-to-open transition of the enzyme adenylate kinase (AdK), a commonly used testbed for the assessment of enhanced sampling algorithms, to examine multiple microsecond equilibrium molecular dynamics (MD) transitions of AdK in its substrate-free form alongside transition ensembles from the MD-based dynamic importance sampling (DIMS-MD) and targeted MD (TMD) methods, and a geometrical targeting algorithm (FRODA). A Hausdorff pairs analysis of these ensembles revealed, for instance, that differences in DIMS-MD and FRODA paths were mediated by a set of conserved salt bridges whose charge-charge interactions are fully modeled in DIMS-MD but not in FRODA. We also demonstrate how existing trajectory analysis methods relying on pre-defined collective variables, such as native contacts or geometric quantities, can be used synergistically with PSA, and how PSA can be applied to more complex systems such as membrane transporter proteins.
    Comment: 9 figures, 3 tables in the main manuscript; supplementary information includes 7 texts (S1 Text - S7 Text) and 11 figures (S1 Fig - S11 Fig) (also available from journal site)
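
The two path metrics named in the abstract above can be illustrated with a minimal sketch. It assumes each path is a list of conformations stored as flat coordinate lists and uses plain Euclidean distance between conformations; the function names are hypothetical, and the point metric used by the authors may differ (for example, an RMSD after structural alignment).

```python
import math


def point_distance(a, b):
    """Euclidean distance between two conformations (flat coordinate lists)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def directed_hausdorff(path_p, path_q):
    """Largest distance from a point on path_p to its nearest point on path_q."""
    return max(min(point_distance(p, q) for q in path_q) for p in path_p)


def hausdorff(path_p, path_q):
    """Symmetric Hausdorff distance; ignores the ordering of points along a path."""
    return max(directed_hausdorff(path_p, path_q),
               directed_hausdorff(path_q, path_p))


def discrete_frechet(path_p, path_q):
    """Discrete Frechet distance via dynamic programming; respects point ordering."""
    n, m = len(path_p), len(path_q)
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = point_distance(path_p[i], path_q[j])
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                best_previous = min(table[i - 1][j],
                                    table[i - 1][j - 1],
                                    table[i][j - 1])
                table[i][j] = max(best_previous, d)
    return table[n - 1][m - 1]


# Toy 2D "paths": the same two points traversed in opposite directions.
forward = [[0.0, 0.0], [1.0, 0.0]]
backward = [[1.0, 0.0], [0.0, 0.0]]
print(hausdorff(forward, backward))         # point sets coincide
print(discrete_frechet(forward, backward))  # ordering makes the paths differ
```

The toy paths visit the same points in opposite order, so the Hausdorff distance sees them as identical while the Fréchet distance does not; this is the practical difference between the two metrics.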

    Comparison of chemical clustering methods using graph- and fingerprint-based similarity measures

    This paper compares several published methods for clustering chemical structures, using both graph- and fingerprint-based similarity measures. The clusterings from each method were compared to determine the degree of cluster overlap. Each method was also evaluated on how well it grouped structures into clusters possessing a non-trivial substructural commonality. The methods that employ adjustable parameters were tested to determine the stability of each parameter for datasets of varying size and composition. Our experiments suggest that both graph- and fingerprint-based similarity measures can be used effectively for generating chemical clusterings; they also indicate that the CAST and Yin–Chen methods, proposed recently for the clustering of gene expression patterns, may prove effective for the clustering of 2D chemical structures.
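
As a rough illustration of a fingerprint-based similarity measure, the sketch below computes the Tanimoto coefficient between fingerprints represented as sets of on-bit indices, and builds the pairwise similarity matrix a clustering method would consume. The abstract does not say which coefficient the paper uses; Tanimoto is assumed here because it is a common choice for chemical fingerprints, and the fingerprints and function names are made up for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as on-bit index sets."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    # Convention chosen for this sketch: two empty fingerprints count as identical.
    return len(a & b) / union if union else 1.0


def similarity_matrix(fingerprints):
    """Symmetric matrix of pairwise Tanimoto similarities."""
    n = len(fingerprints)
    return [[tanimoto(fingerprints[i], fingerprints[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical fingerprints: indices of the bits that are set.
fps = [{1, 2, 3}, {2, 3, 4}, {10, 11}]
for row in similarity_matrix(fps):
    print(row)
```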

    Predicting cancer drug mechanisms of action using molecular network signatures

    Molecular signatures are a powerful approach to characterize novel small molecules and derivatized small molecule libraries. While new experimental techniques are being developed in diverse model systems, informatics approaches lag behind these exciting advances. We propose an analysis pipeline for signature-based drug annotation. We develop an integrated strategy, utilizing supervised and unsupervised learning methodologies that are bridged by network-based statistics. Using this approach we can: 1, predict new examples of drug mechanisms that we trained our model upon; 2, identify “New” mechanisms of action that do not belong to drug categories that our model was trained upon; and 3, update our training sets with these “New” mechanisms and accurately predict entirely distinct examples from these new categories. Thus, not only does our strategy provide statistical generalization but it also offers biological generalization. Additionally, we show that our approach is applicable to diverse types of data, and that distinct biological mechanisms characterize its resolution of categories across different data types. As particular examples, we find that our predictive resolution of drug mechanisms from mRNA expression studies relies upon the analog measurement of a cell stress-related transcriptional rheostat along with a transcriptional representation of cell cycle state; in contrast, drug mechanism resolution from functional RNAi studies relies upon more dichotomous (e.g., either enhances or inhibits) association with cell death states. We believe that our approach can facilitate molecular signature-based drug mechanism understanding from different technology platforms and across diverse biological phenomena.
    National Cancer Institute (U.S.) (NCI Integrative Cancer Biology Program grant U54-CA112967)

    Methods for protein complex prediction and their contributions towards understanding the organization, function and dynamics of complexes

    Complexes of physically interacting proteins constitute fundamental functional units responsible for driving biological processes within cells. A faithful reconstruction of the entire set of complexes is therefore essential to understand the functional organization of cells. In this review, we discuss the key contributions of computational methods developed to date (approximately between 2003 and 2015) for identifying complexes from the network of interacting proteins (PPI network). We evaluate in depth the performance of these methods on PPI datasets from yeast, and highlight challenges faced by these methods, in particular the detection of sparse and small complexes or sub-complexes and the discerning of overlapping complexes. We describe methods for integrating diverse information, including expression profiles and 3D structures of proteins, with PPI networks to understand the dynamics of complex formation, for instance, the time-based assembly of complex subunits and the formation of fuzzy complexes from intrinsically disordered proteins. Finally, we discuss methods for identifying dysfunctional complexes in human diseases, an application that is proving invaluable to understand disease mechanisms and to discover novel therapeutic targets. We hope this review aptly commemorates a decade of research on computational prediction of complexes and constitutes a valuable reference for further advancements in this exciting area.
    Comment: 1 Table

    PANDORA: A Parallel Dendrogram Construction Algorithm for Single Linkage Clustering on GPU

    This paper presents PANDORA, a novel parallel algorithm for efficiently constructing dendrograms for single-linkage hierarchical clustering, including HDBSCAN. Traditional methods for constructing a dendrogram from a minimum spanning tree (MST), such as agglomerative or divisive techniques, often fail to parallelize efficiently, especially with the skewed dendrograms common in real-world data. PANDORA addresses these challenges through a unique recursive tree contraction method, which simplifies the tree for initial dendrogram construction and then progressively reconstructs the complete dendrogram. This process makes PANDORA asymptotically work-optimal, independent of dendrogram skewness. All steps in PANDORA are fully parallel and suitable for massively threaded accelerators such as GPUs. Our implementation is written in Kokkos, providing support for both CPUs and multi-vendor GPUs (e.g., Nvidia, AMD). The multithreaded version of PANDORA is 2.2× faster than the current best multithreaded implementation, while the GPU implementation of PANDORA achieved speed-ups of 6-20× on an AMD GPU and 10-37× on an Nvidia GPU over multithreaded PANDORA. These advancements lead to up to a 6-fold speedup for HDBSCAN on GPUs over the current best implementations, which only offload MST construction to GPUs and perform multithreaded dendrogram construction.
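
For context, the sequential agglomerative baseline that the abstract contrasts with can be sketched as follows: sort the MST edges by weight and merge components with a union-find structure, recording one dendrogram node per merge. This is only an illustration of the traditional bottom-up construction, not PANDORA's parallel tree contraction; the function name and the output layout (loosely modelled on a linkage table) are assumptions.

```python
def dendrogram_from_mst(n_points, mst_edges):
    """Build single-linkage merges from MST edges given as (u, v, weight).

    Returns a list of (node_a, node_b, distance, cluster_size) tuples in merge
    order. Leaves are numbered 0..n_points-1; the k-th merge creates node
    n_points + k.
    """
    parent = list(range(n_points))
    node_id = list(range(n_points))  # dendrogram node represented by each root
    size = [1] * n_points

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merges = []
    next_id = n_points
    # Processing edges in increasing weight order is the inherently
    # sequential step that is hard to parallelize.
    for u, v, weight in sorted(mst_edges, key=lambda edge: edge[2]):
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            continue  # cannot happen for a true spanning tree
        merges.append((node_id[root_u], node_id[root_v], weight,
                       size[root_u] + size[root_v]))
        parent[root_v] = root_u
        size[root_u] += size[root_v]
        node_id[root_u] = next_id
        next_id += 1
    return merges


# Hypothetical MST over four points (a path 0-1-2-3).
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 0.5)]
for merge in dendrogram_from_mst(4, edges):
    print(merge)
```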

    Hierarchical Portfolio Optimization

    The field of Portfolio Optimization has historically had a very hard time, as the mathematical models available to it are based on assumptions one cannot afford to make in the financial markets, making naive approaches all too enticing. In this project we have introduced the assumption that the different stocks in the financial markets have a hierarchical structure and have allowed ourselves to be inspired by it to build portfolios through a Machine Learning approach. We have employed the Hierarchical Risk Parity algorithm and tested minor variations relating to the dissimilarity measure it makes use of. The tests were conducted with historical daily closing price data from 2014 to 2020 for 440 stocks in the S&P 500 index. Results suggest most of the tested Hierarchical Risk Parity variants are robust and can compete with the Equal Weights Portfolio. We mainly encourage the use of two dissimilarity measures: the standard one (a correlation-based metric) and Dynamic Time Warping. The former is suggested to the pessimistic investor, while the latter is suggested to the hopeful yet conservative investor. To optimistic investors with a high risk tolerance, the recommendation would be to use the traditional Equal Weights portfolio among the asset allocation methods considered in this project.
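
The two dissimilarity measures recommended above can be sketched in a few lines. The correlation-based distance shown is the form usually paired with Hierarchical Risk Parity, d = sqrt((1 - rho) / 2); whether the project uses exactly this form, and which local cost and constraints it applies in Dynamic Time Warping, are assumptions. The return series are made-up numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Map correlation in [-1, 1] to a distance in [0, 1]."""
    rho = pearson(x, y)
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 0.5 * (1.0 - rho)))


def dtw_distance(x, y):
    """Unconstrained Dynamic Time Warping with absolute-difference local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # advance in x only
                                    cost[i][j - 1],      # advance in y only
                                    cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


# Hypothetical daily returns for two stocks.
returns_a = [0.01, -0.02, 0.03, 0.00]
returns_b = [0.02, -0.01, 0.02, 0.01]
print(correlation_distance(returns_a, returns_b))
print(dtw_distance(returns_a, returns_b))
```

Either function can fill the pairwise dissimilarity matrix that the hierarchical clustering step of Hierarchical Risk Parity takes as input.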
