235 research outputs found

    Application and Development of Computational Methods for Ligand-Based Virtual Screening

    Get PDF
    The detection of novel active compounds that are able to modulate the biological function of a target is the primary goal of drug discovery. Different screening methods are available to identify hit compounds having the desired bioactivity in a large collection of molecules. As a computational method, virtual screening (VS) is used to search compound libraries in silico and identify those compounds that are likely to exhibit a specific activity. Ligand-based virtual screening (LBVS) is a subdiscipline that uses the information of one or more known active compounds in order to identify new hit compounds. Different LBVS methods exist, e.g. similarity searching and support vector machines (SVMs). In order to enable the application of these computational approaches, compounds have to be described numerically. Fingerprints derived from the two-dimensional compound structure, called 2D fingerprints, are among the most popular molecular descriptors available. This thesis covers the usage of 2D fingerprints in the context of LBVS. The first part focuses on a detailed analysis of 2D fingerprints. Their performance range against a wide range of pharmaceutical targets is globally estimated through fingerprint-based similarity searching. Additionally, mechanisms by which fingerprints are capable of detecting structurally diverse active compounds are identified. For this purpose, two different feature selection methods are applied to find those fingerprint features that are most relevant for the active compounds and distinguish them from other compounds. Then, 2D fingerprints are used in SVM calculations. The SVM methodology provides several opportunities to include additional information about the compounds in order to direct LBVS search calculations. In a first step, a variant of the SVM approach is applied to the multi-class prediction problem involving compounds that are active against several related targets. SVM linear combination is used to recover compounds with desired activity profiles and deprioritize compounds with other activities. Then, the SVM methodology is adopted for potency-directed VS. Compound potency is incorporated into the SVM approach through potencyoriented SVM linear combination and kernel function design to direct search calculations to the preferential detection of potent hit compounds. Next, SVM calculations are applied to address an intrinsic limitation of similarity-based methods, i.e., the presence of similar compounds having large differences in their potency. An especially designed SVM approach is introduced to predict compound pairs forming such activity cliffs. Finally, the impact of different training sets on the recall performance of SVM-based VS is analyzed and caveats are identified

    Data-Driven Rational Drug Design

    Get PDF
    Vast amount of experimental data in structural biology has been generated, collected and accumulated in the last few decades. This rich dataset is an invaluable mine of knowledge, from which deep insights can be obtained and practical applications can be developed. To achieve that goal, we must be able to manage such Big Data\u27\u27 in science and investigate them expertly. Molecular docking is a field that can prominently make use of the large structural biology dataset. As an important component of rational drug design, molecular docking is used to perform large-scale screening of putative associations between small organic molecules and their pharmacologically relevant protein targets. Given a small molecule (ligand), a molecular docking program simulates its interaction with the target protein, and reports the probable conformation of the protein-ligand complex, and the relative binding affinity compared against other candidate ligands. This dissertation collects my contributions in several aspects of molecular docking. My early contribution focused on developing a novel metric to quantify the structural similarity between two protein-ligand complexes. Benchmarks show that my metric addressed several issues associated with the conventional metric. Furthermore, I extended the functionality of this metric to cross different systems, effectively utilizing the data at the proteome level. After developing the novel metric, I formulated a scoring function that can extract the biological information of the complex, integrate it with the physics components, and finally enhance the performance. Through collaboration, I implemented my model into an ultra-fast, adaptive program, which can take advantage of a range of modern parallel architectures and handle the demanding data processing tasks in large scale molecular docking applications

    Analysis of Biological Screening Data and Molecular Selectivity Profiles Using Fingerprints and Mapping Algorithms

    Get PDF
    The identification of promising drug candidates is a major milestone in the early stages of drug discovery and design. Among the properties that have to be optimized before a drug candidate is admitted to clinical testing, potency and target selectivity are of great interest and can be addressed very early. Unfortunately, optimization–relevant knowledge is often limited, and the analysis of noisy and heterogeneous biological screening data with standard methods like QSAR is hardly feasible. Furthermore, the identification of compounds displaying different selectivity patterns against related targets is a prerequisite for chemical genetics and genomics applications, allowing to specifically interfere with functions of individual members of protein families. In this thesis it is shown that computational methods based on molecular similarity are suitable tools for the analysis of compound potency and target selectivity. Originally developed to facilitate the efficient discovery of active compounds by means of virtual screening of compound libraries, these ligand–based approaches assume that similar molecules are likely to exhibit similar properties and biological activities based on the similarity property principle. Given their holistic approach to molecular similarity analysis, ligand–based virtual screening methods can be applied when little or no structure– activity information is available and do not require the knowledge of the target structure. The methods under investigation cover a wide methodological spectrum and only rely on properties derived from one– and two–dimensional molecular representations, which renders them particularly useful for handling large compound libraries. Using biological screening data, these virtual screening methods are shown to be able to extrapolate from experimental data and preferentially detect potent compounds. Subsequently, extensive benchmark calculations prove that existing 2D molecular fingerprints and dynamic mapping algorithms are suitable tools for the distinction between compounds with differential selectivity profiles. Finally, an advanced dynamic mapping algorithm is introduced that is able to generate target–selective chemical reference spaces by adaptively identifying most–discriminative molecular properties from a set of active compounds. These reference spaces are shown to be of great value for the generation of predictive target–selectivity models by screening a biologically annotated compound library. </p
    • …
    corecore