3 research outputs found

    Feature Reduction for Molecular Similarity Searching Based on Autoencoder Deep Learning

    The concept of molecular similarity is commonly used in rational drug design, where molecular databases are searched for structurally similar molecules in order to retrieve functionally similar ones. Most conventional similarity methods use two-dimensional (2D) fingerprints to evaluate the similarity of molecules to a target query. However, these descriptors include redundant and irrelevant features that might impair the performance of similarity searching methods. This study therefore proposed a new approach for identifying the important features of molecules in chemical datasets, based on representing the molecular features with an Autoencoder (AE), with the aim of removing irrelevant and redundant features. The proposed approach was evaluated on the MDL Drug Data Report standard dataset (MDDR). Based on the experimental findings, the proposed approach performed better than several existing benchmark similarity methods, such as the Tanimoto Similarity Method (TAN), the Adapted Similarity Measure of Text Processing (ASMTP), and the Quantum-Based Similarity Method (SQB). The improvement was most pronounced on structurally heterogeneous datasets, where the approach yielded better results than previously used methods with the same goal of improving molecular similarity searching.
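    For readers unfamiliar with the TAN baseline named in this abstract, the following is a minimal sketch of Tanimoto similarity searching over binary 2D fingerprints. It is not the authors' code: fingerprints are assumed to be given as sets of on-bit indices, and the function names and the toy database are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two binary fingerprints.

    Each fingerprint is assumed to be a collection of on-bit indices
    (an illustrative encoding, not one taken from the paper).
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints share no evidence of similarity.
        return 0.0
    return len(a & b) / union


def rank_by_similarity(query_fp, database):
    """Return (molecule_id, score) pairs sorted from most to least similar."""
    scored = [(mol_id, tanimoto(query_fp, fp)) for mol_id, fp in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy database: molecule id -> on-bit indices.
toy_database = {
    "mol_a": {1, 2, 3},
    "mol_b": {2, 3, 4},
    "mol_c": {7, 8},
}
ranking = rank_by_similarity({1, 2, 3}, toy_database)
```

    In this toy ranking `mol_a` comes first because it is identical to the query. A feature-reduction step such as the autoencoder described above would replace the raw fingerprints with a smaller learned representation before any comparison is made.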

    Molecular similarity searching based on deep learning for feature reduction

    The concept of molecular similarity is widely used in rational drug design, where molecular databases are explored for structurally similar molecules in order to retrieve functionally similar ones. Most conventional similarity methods use two-dimensional (2D) fingerprints to evaluate the similarity of molecules to a target query. However, these descriptors include redundant and irrelevant features that might reduce the effectiveness of similarity searching methods. Moreover, most existing similarity searching methods disregard the greater importance of some features over others and assume that all features are equally important. This study therefore proposed three approaches for identifying the important features of molecules in chemical datasets. The first approach was based on representing the molecular features with an Autoencoder (AE), which removes irrelevant and redundant features. The second approach was a feature selection model based on Deep Belief Networks (DBN), in which the DBN is used to find the subset of features that represents the important ones. The third approach combined descriptors that complement each other: important features from several descriptors were filtered through the DBN and combined to form a new descriptor for molecular similarity searching. The proposed approaches were evaluated on the MDL Drug Data Report standard dataset (MDDR). Based on the test results, the three proposed approaches outperformed several existing benchmark similarity methods, such as Bayesian Inference Networks (BIN), the Tanimoto Similarity Method (TAN), the Adapted Similarity Measure of Text Processing (ASMTP), and the Quantum-Based Similarity Method (SQB). The three proposed approaches achieved better average recall values, especially on structurally heterogeneous datasets, where they produced better results than methods previously used to improve molecular similarity searching.
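    The abstract reports its results as average recall values. The sketch below shows one common way to compute such a figure from ranked search results; it is not the authors' evaluation code. The cutoff fraction, the function names, and the molecule identifiers are all assumptions made for illustration.

```python
def recall_at_cutoff(ranked_ids, active_ids, fraction=0.05):
    """Share of known actives found in the top `fraction` of a ranked list.

    `fraction` is an assumed cutoff; the abstract does not state which
    cutoff the study used.
    """
    actives = set(active_ids)
    if not actives:
        raise ValueError("active_ids must contain at least one molecule")
    top_n = max(1, int(len(ranked_ids) * fraction))
    retrieved = set(ranked_ids[:top_n])
    return len(retrieved & actives) / len(actives)


def average_recall(runs, fraction=0.05):
    """Mean recall over several (ranked_ids, active_ids) query runs."""
    values = [recall_at_cutoff(ranked, actives, fraction) for ranked, actives in runs]
    return sum(values) / len(values)


# Hypothetical ranked output for one query over 20 molecules.
ranked = ["m%d" % i for i in range(20)]
actives = {"m0", "m1", "m5", "m19"}
recall_top_quarter = recall_at_cutoff(ranked, actives, fraction=0.25)
```

    Here the top quarter of the list holds five molecules, two of which are active, so the recall is 2 out of 4. Averaging over many queries and activity classes gives a single figure for comparing methods.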

    Bioactivity prediction using convolutional neural network

    According to the similar property principle, structurally similar compounds exhibit very similar properties and similar biological activities. Many researchers have applied this principle to the discovery of novel drugs, which has led to methods that predict the activities of compounds from their chemical structure, since the toxic or biological properties of compounds are determined by their chemical structure and, in particular, by their substructures. The concept of functional groups (FGs) of connected atoms (small molecules) that determine the properties and reactivity of the parent molecule forms the cornerstone of organic chemistry, medicinal chemistry, toxicity assessment, and QSAR. This study introduced a novel predictive system, a convolutional neural network that predicts molecular bioactivities using a novel molecular matrix representation. The number of atoms in small molecules was investigated to determine its effect on accuracy when predicting the activities of orphan compounds. The approach was applied to popular datasets, and its performance was compared with that of three classical ML algorithms. All the experiments indicated that the proposed model achieved a promising prediction rate (accuracy of 90.21).
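    The abstract does not specify the molecular matrix representation or the network architecture. The sketch below therefore only illustrates the basic operation a convolutional layer applies to a matrix-shaped input. It assumes a small atom adjacency matrix as a hypothetical stand-in for the paper's representation, and the kernel values are arbitrary rather than learned.

```python
def conv2d_valid(matrix, kernel):
    """Slide `kernel` over `matrix` without padding (cross-correlation,
    which is what CNN layers compute)."""
    rows, cols = len(matrix), len(matrix[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    for i in range(rows - k_rows + 1):
        out_row = []
        for j in range(cols - k_cols + 1):
            total = sum(
                matrix[i + di][j + dj] * kernel[di][dj]
                for di in range(k_rows)
                for dj in range(k_cols)
            )
            out_row.append(total)
        output.append(out_row)
    return output


def relu(feature_map):
    """Element-wise rectified linear activation."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical input: adjacency matrix of a three-atom chain
# (NOT the representation used in the paper).
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
kernel = [[1, 1], [1, 1]]  # arbitrary illustrative weights
feature_map = relu(conv2d_valid(adjacency, kernel))
```

    In a full model, several such feature maps would be pooled and passed to dense layers that output an activity class, with the kernel weights learned from labelled compounds.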